

A Concrete Approach to 
Abstract Algebra 


From the Integers to the Insolvability of the Quintic 


Jeffrey Bergen 


DePaul University 
Chicago, Illinois 


AMSTERDAM • BOSTON • HEIDELBERG • LONDON
NEW YORK • OXFORD • PARIS • SAN DIEGO
SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO




Academic Press is an imprint of Elsevier 

30 Corporate Drive, Suite 400, Burlington, MA 01803, USA 

525 B Street, Suite 1900, San Diego, California 92101-4495, USA 
84 Theobald’s Road, London WC1X 8RR, UK 


Copyright © 2010 by Elsevier, Inc. All rights reserved. 


No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including 
photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. 
Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with 
organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: 
www.elsevier.com/permissions. 


This book and the individual contributions contained in it are protected under copyright by the Publisher 
(other than as may be noted herein). 


Notices 
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, 
changes in research methods, professional practices, or medical treatment may become necessary. 


Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, 
methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their 
own safety and the safety of others, including parties for whom they have a professional responsibility. 


To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury 
and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of 
any methods, products, instructions, or ideas contained in the material herein. 


Library of Congress Cataloging-in-Publication Data 
Bergen, Jeffrey, 1955. 

A concrete approach to abstract algebra : from the integers to the insolvability of the quintic / Jeffrey Bergen. 

p. cm.

Includes bibliographical references and index.

ISBN 978-0-12-374941-3 (hard cover : alk. paper) 1. Algebra, Abstract. I. Title.

QA162.B45 2010

512’.02-dc22

2009035349 


British Library Cataloguing-in-Publication Data 
A catalogue record for this book is available from the British Library. 


ISBN: 978-0-12-374941-3 


For information on all Academic Press publications 
visit our Web site at www.elsevierdirect.com 


Typeset by: diacriTech, India 


Printed in the United States 
10 11 12   9 8 7 6 5 4 3 2 1


Working together to grow 
libraries in developing countries 


www.elsevier.com | www.bookaid.org | www.sabre.org




To Donna 




Contents


Preface
A User’s Guide
Acknowledgments

Chapter 1 What This Book Is about and Who This Book Is for
    1.1 Algebra
        1.1.1 Finding Roots of Polynomials
        1.1.2 Existence of Roots of Polynomials
        1.1.3 Solving Linear Equations
    1.2 Geometry
        1.2.1 Ruler and Compass Constructions
    1.3 Trigonometry
        1.3.1 Rational Values of Trigonometric Functions
    1.4 Precalculus
        1.4.1 Recognizing Polynomials Using Data
    1.5 Calculus
        1.5.1 Partial Fraction Decomposition
        1.5.2 Detecting Multiple Roots of Polynomials
    Exercises for Chapter 1

Chapter 2 Proof and Intuition
    2.1 The Well Ordering Principle
    2.2 Proof by Contradiction
    2.3 Mathematical Induction
        Mathematical Induction—First Version
        Mathematical Induction—First Version Revisited
        Mathematical Induction—Second Version
    Exercises for Sections 2.1, 2.2, and 2.3
    2.4 Functions and Binary Operations
    Exercises for Section 2.4

Chapter 3 The Integers
    3.1 Prime Numbers
    3.2 Unique Factorization
    3.3 Division Algorithm
    Exercises for Sections 3.1, 3.2, and 3.3
    3.4 Greatest Common Divisors
    3.5 Euclidean Algorithm
    Exercises for Sections 3.4 and 3.5

Chapter 4 The Rational Numbers and the Real Numbers
    4.1 Rational Numbers
    4.2 Intermediate Value Theorem
    Exercises for Sections 4.1 and 4.2
    4.3 Equivalence Relations
    Exercises for Section 4.3

Chapter 5 The Complex Numbers
    5.1 Complex Numbers
    5.2 Fields and Commutative Rings
    Exercises for Sections 5.1 and 5.2
    5.3 Complex Conjugation
    5.4 Automorphisms and Roots of Polynomials
    Exercises for Sections 5.3 and 5.4
    5.5 Groups of Automorphisms of Commutative Rings
    Exercises for Section 5.5

Chapter 6 The Fundamental Theorem of Algebra
    6.1 Representing Real Numbers and Complex Numbers Geometrically
    6.2 Rectangular and Polar Form
    Exercises for Sections 6.1 and 6.2
    6.3 DeMoivre’s Theorem and Roots of Complex Numbers
    6.4 A Proof of the Fundamental Theorem of Algebra
    Exercises for Sections 6.3 and 6.4

Chapter 7 The Integers Modulo n
    7.1 Definitions and Basic Properties
    7.2 Zero Divisors and Invertible Elements
    Exercises for Sections 7.1 and 7.2
    7.3 The Euler φ Function
    7.4 Polynomials with Coefficients in Z_n
    Exercises for Sections 7.3 and 7.4

Chapter 8 Group Theory
    8.1 Definitions and Examples
        I. Commutative Rings and Fields under Addition
        II. Invertible Elements in Commutative Rings under Multiplication
        III. Bijections of Sets
    Exercises for Section 8.1
    8.2 Theorems of Lagrange and Sylow
    Exercises for Section 8.2
    8.3 Solvable Groups
    Exercises for Section 8.3
    8.4 Symmetric Groups
    Exercises for Section 8.4

Chapter 9 Polynomials over the Integers and Rationals
    9.1 Integral Domains and Homomorphisms of Rings
    Exercises for Section 9.1
    9.2 Rational Root Test and Irreducible Polynomials
    Exercises for Section 9.2
    9.3 Gauss’ Lemma and Eisenstein’s Criterion
    Exercises for Section 9.3
    9.4 Reduction Modulo p
    Exercises for Section 9.4

Chapter 10 Roots of Polynomials of Degree Less than 5
    10.1 Finding Roots of Polynomials of Small Degree
    10.2 A Brief Look at Some Consequences of Galois’ Work
    Exercises for Sections 10.1 and 10.2

Chapter 11 Rational Values of Trigonometric Functions
    11.1 Values of Trigonometric Functions
    Exercises for Section 11.1

Chapter 12 Polynomials over Arbitrary Fields
    12.1 Similarities between Polynomials and Integers
    12.2 Division Algorithm
    Exercises for Sections 12.1 and 12.2
    12.3 Irreducible and Minimum Polynomials
    12.4 Euclidean Algorithm and Greatest Common Divisors
    Exercises for Sections 12.3 and 12.4
    12.5 Formal Derivatives and Multiple Roots
    Exercises for Section 12.5

Chapter 13 Difference Functions and Partial Fractions
    13.1 Difference Functions
    13.2 Polynomials and Mathematical Induction
    Exercises for Sections 13.1 and 13.2
    13.3 Partial Fraction Decomposition
    Exercises for Section 13.3

Chapter 14 An Introduction to Linear Algebra and Vector Spaces
    14.1 Examples, Examples, Examples, and a Definition
    Exercises for Section 14.1
    14.2 Spanning Sets and Linear Independence
    14.3 Basis and Dimension
    Exercises for Sections 14.2 and 14.3
    14.4 Subspaces and Linear Equations
    Exercises for Section 14.4

Chapter 15 Degrees and Galois Groups of Field Extensions
    15.1 Degrees of Field Extensions
    Exercises for Section 15.1
    15.2 Simple Extensions
    15.3 Splitting Fields and Their Galois Groups
    Exercises for Sections 15.2 and 15.3

Chapter 16 Geometric Constructions
    16.1 Constructible Points and Constructible Real Numbers
    16.2 The Impossibility of Trisecting Angles
    Exercises for Sections 16.1 and 16.2

Chapter 17 Insolvability of the Quintic
    17.1 Radical Extensions and Their Galois Groups
    17.2 A Proof of the Insolvability of the Quintic
    Exercises for Sections 17.1 and 17.2
    17.3 Kronecker’s Theorem
    Exercises for Section 17.3

Bibliography
Index


Preface 


Abstract algebra, perhaps more than any other subject studied in college, has strong ties to the 
mathematics courses students have taken in high school. A course in abstract algebra can 
provide answers to many questions that are posed but not answered in high school 
mathematics courses. This is one reason that all mathematics majors, especially those hoping 
to teach at the high school or college level, can benefit from a course in abstract algebra. 


Many instructors have witnessed students who, despite having had success in courses up 
through multivariable calculus and linear algebra, struggle in abstract algebra. Some of these 
instructors wonder if abstract algebra should even be required for math majors with a 
secondary education concentration. While writing this book, I was keenly aware of these 
issues. 


This book was written because of my conviction that all mathematics majors should take 
abstract algebra, and, more importantly, all mathematics majors can learn abstract algebra. 
Some of the features that I believe will assist students in learning this subject are: 


1. Links to previous mathematics courses: This book uses abstract algebra to answer basic 
questions that arise in courses in algebra, geometry, trigonometry, precalculus, and 
calculus. Concepts in abstract algebra are introduced as the tools needed to solve these 
basic questions. 


2. Exercises: Courses up through multivariable calculus and linear algebra provide students 
with many exercises that allow them to practice and master new concepts. This book has 
1996 exercises, many of which give students lots of practice working with concrete 
examples of new concepts. For example, in Chapter 8, students will have many chances to 
look at a multiplication table of a group and then compute cyclic subgroups, cosets, and 
centralizers. 


At various points, the exercises may appear to be somewhat repetitive. This is deliberate.
In many books, instructors find an interesting exercise and then are faced with the choice
of whether to include it in the lecture or in the homework. Perhaps the exercise’s solution
is in the solutions manual and the instructor prefers to assign problems where the solution
is not readily available. Sometimes the exact opposite situation occurs. To avoid these
problems, this book often includes many similar-looking exercises. This also gives the
student more chances to practice and master concepts than is typically found in abstract 
algebra texts. 


3. Examples before definitions: Students in abstract algebra courses are often overwhelmed 
or intimidated by the sheer volume of definitions and new objects. Whenever possible, this 
book attempts to provide examples before definitions so that definitions reflect the 
collecting of properties common to several concrete examples. For example, the integers, 
rational numbers, real numbers, and complex numbers are introduced before the 
definitions of commutative rings and fields are given. Similarly, concrete objects such as 
the invertible elements of the integers modulo n and the bijections of a set are studied 
before our formal discussion of groups. When new concepts are introduced, such as 
automorphisms in Chapter 5 and ring homomorphisms in Chapter 9, they are immediately 
applied to familiar problems such as finding roots of polynomials and determining when 
polynomials are irreducible. 


4. Fundamental Theorem of Algebra: Virtually every abstract algebra textbook mentions the 
Fundamental Theorem of Algebra, but very few contain a proof. The reason is that a 
primarily algebraic proof requires so many new ideas that it would take most books too far 
off course. However, in Chapter 6, we present a proof based on some familiar ideas from 
one and two variable calculus. We have chosen this direction for both philosophical and 
practical reasons. 


One of the goals of this book is to help students develop a deep understanding of the roots 
and factoring of polynomials over different number systems. Occasionally, this requires 
examining topics that are not traditionally part of an algebra course, such as the 
Intermediate Value Theorem and the Fundamental Theorem of Algebra. However, these 
topics are essential for an understanding of the differences in the behavior of polynomials 
over the rational numbers, the real numbers, and the complex numbers. 


As a practical matter, having the Fundamental Theorem of Algebra at our disposal makes 
it much easier to introduce Galois theory and then prove the insolvability of the quintic. 
Students often struggle with the level of abstraction in Galois theory. However, when 
examining the roots of polynomials with rational coefficients, the Fundamental Theorem 
of Algebra allows us to always work with fields that lie between the rational numbers and 
complex numbers. This more concrete approach makes the key ideas of Galois theory 
easier to understand and greatly simplifies the proof of the insolvability of the quintic. 


5. Theorems with proofs: With the occasional exception of results from courses below 
abstract algebra, if a theorem appears in this book, so will its proof. A philosophy 
underlying this book is that reading proofs is an essential part of abstract algebra. 
Sometimes textbooks will state powerful theorems, without proof, and then use them to 
obtain other important results. For example, abstract algebra books often state, without
proof, the Fundamental Theorem of Algebra or the Fundamental Theorem of Galois
theory and then use them to prove other results. I believe this approach can stand in the 
way of students gaining a deep understanding and appreciation of algebra. 


There will be times in this book when the theorems we state, prove, and apply are not the most 
general results known. However, as opposed to applying stronger results whose proofs they 
have never seen, I believe students will learn more applying results whose proofs they have 
worked through. 


Please feel free to e-mail your thoughts, comments, and corrections to me at 
jbergen@depaul.edu. You can find a list of corrections at www.depaul.edu/~jbergen.




A User’s Guide 


A yearlong course in abstract algebra can cover this entire book with sufficient time for a 
thorough treatment of each section. However, it can easily be adapted to courses that meet for 
only one quarter, one semester, or two quarters. For courses that run less than a year, the 
chapter summaries following should help instructors decide which sections to skip and how to 
sequence the sections that are covered. 


Chapter 1—This introductory chapter points out that many questions that arose and were left 
unanswered in a student’s previous courses in algebra, geometry, trigonometry, precalculus, 
and calculus can now be answered using abstract algebra. It previews many of the results that 
will be proven in this text, such as the insolvability of the quintic, the Fundamental Theorem 
of Algebra, the impossibility of trisecting angles, rational values of trigonometric functions, 
partial fraction decomposition, and multiple roots of polynomials. This chapter can either be 
covered in class or left as a reading assignment. It is not a prerequisite for any of the later 
chapters. 


Chapter 2—Sections 2.1, 2.2, and 2.3 begin by discussing the importance of both intuition 
and rigor in mathematics. They then focus on proofs by contradiction, the Well Ordering 
Principle, and Mathematical Induction. Throughout this book, it will be very important for 
your students to have a solid understanding of these sections. However, if your students are 
already adept at writing proofs, these sections can be left as a reading assignment. 


Section 2.4 introduces functions and binary operations. This section will be the foundation for 
much of the material in this book. To make our detailed examination of groups in Chapter 8 
more accessible to students, examples of groups will appear at various points before then. In 
particular, groups are briefly discussed in Section 2.4 when we look at injective, surjective, 
and bijective functions. 


Chapter 3—This chapter focuses on properties of the integers, such as prime numbers and the 
Euclidean Algorithm. The most important result in this chapter is the existence and uniqueness 
of prime factorization. Exercises 31-37, immediately after Section 3.3, might be particularly 
helpful to students who wonder why a concept as intuitive as unique factorization requires 
proof. The ideas presented in this chapter are used throughout this book. In particular, our 
discussion of polynomial rings in Chapter 12 follows the pattern set forth in this chapter. 




Chapter 4—Sections 4.1 and 4.2 contain topics that are not required for later chapters. 
Section 4.1 examines rational numbers and the relationship between fractions and repeating 
decimals. Section 4.2 compares the rational numbers and the real numbers and focuses on the 
least upper bound property, the Intermediate Value Theorem, and roots of polynomials with 
real coefficients. Some instructors may choose to skip these sections, as they cover topics that 
rarely appear in abstract algebra courses. However, if a student has not seen these topics in 
previous courses, they have the opportunity to see them here. 


Equivalence relations and equivalence classes are introduced in Section 4.3. These topics will 
reappear many times throughout this book. Since students often struggle with quotient groups 
and quotient rings, many examples and exercises are provided that examine the addition and 
multiplication of equivalence classes and when these operations are well defined. 


Chapter 5—This chapter introduces the complex numbers and uses them, along with the 
integers, rational numbers, and real numbers, to motivate the definitions of commutative
rings and fields. Complex conjugation and its relationship to roots of polynomials are then
used to motivate the definitions of automorphisms and Galois groups. Chapters 8, 15, and 17 
contain a more detailed and theoretical treatment of groups and automorphisms. However, it is 
helpful for students to gain experience, at this stage, working with concrete examples of these 
objects. 


Chapter 6—One of the themes of Chapters 5 and 6 is to demystify the complex numbers and 
to show that they are as real as the real numbers. In Chapter 5, we show that the construction 
of the complex numbers from the real numbers is simpler and more straightforward than either 
the construction of the rational numbers from the integers or the real numbers from the rational 
numbers. In Sections 6.1, 6.2, and 6.3, polar form and DeMoivre’s Theorem are introduced 
and are used to help show that the addition and multiplication of complex numbers can be 
viewed in a very concrete and geometric manner. 


Section 6.4 contains a proof of the Fundamental Theorem of Algebra. This allows us to deal 
with fields, Galois groups, and the insolvability of the quintic more concretely in Chapters 15 
and 17, as we only need to work with fields that are contained in the complex numbers. 
Abstract algebra courses that do not run for a full year might need to omit Chapters 15 and 17. 
In this case, Section 6.4 can also be omitted. 


Chapter 7—Sections 7.1, 7.2, and 7.4 examine the integers modulo n and provide many 
examples of commutative rings, fields, and groups. The ideas in these sections are needed 
when we examine polynomials with integer and rational coefficients in Chapter 9 and also for 
the proof of Kronecker’s Theorem in Chapter 17. Section 7.3 looks at the Euler φ function and
is not a prerequisite for any of the later chapters. 
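
As a rough illustration of the objects this chapter studies, the following Python sketch lists the invertible elements of the integers modulo n and counts them, which gives the value of the Euler φ function. The names units_mod and euler_phi are illustrative choices rather than notation from the text, and the brute-force search is only one simple way to compute these values.

```python
from math import gcd


def units_mod(n):
    """Return the invertible elements of the integers modulo n, for n >= 1.

    A residue a is invertible modulo n exactly when gcd(a, n) == 1.
    """
    return [a for a in range(n) if gcd(a, n) == 1]


def euler_phi(n):
    """Count the invertible elements modulo n (the Euler phi function)."""
    return len(units_mod(n))


if __name__ == "__main__":
    print(units_mod(12))   # invertible residues modulo 12
    print(euler_phi(12))   # how many there are
```

For n = 12 the invertible elements are 1, 5, 7, and 11, so φ(12) = 4.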


Chapter 8—This chapter, which examines the structure of finite groups, can be covered in 
many different ways depending on how the instructor structures the course. Since students will
have already worked with examples of groups in Chapters 2, 5, and 7, they should be well
prepared for the more formal and detailed treatment in Chapter 8. If a course proceeds 
sequentially through this text, Sections 8.1 and 8.2 will be covered toward the end of the first 
semester. Therefore, even if students only take one semester of abstract algebra, they can still 
see a proof of Sylow’s Theorem. 


Sections 8.3 and 8.4 deal with solvable and symmetric groups and are only needed for 
Chapters 15 and 17. Since Chapter 8 is quite long, instructors may decide to take a short break 
from group theory after Section 8.2, as Section 8.3 can be covered at any point before
Section 15.3 and Section 8.4 at any point before Chapter 17.


If an abstract algebra course runs for only one quarter, one semester, or two quarters, the 
instructor may determine that the brief introduction to groups in Chapters 2, 5, and 7 is 
sufficient and then skip Chapter 8 entirely. This would allow time to cover some of the links 
between abstract algebra and the high school curriculum in Chapters 9, 11, and 13 that do not 
require group theory. 


Chapter 9—This chapter helps to illustrate the importance of ring homomorphisms and the 
integers modulo p by using them to prove the Rational Root Test, Gauss’ Lemma, and 
Eisenstein’s Criterion. Since this chapter examines the roots and irreducibility of polynomials 
over the integers, rationals, reals, and complex numbers, it should be particularly useful for 
students planning to teach algebra at the high school or community college level. 


Chapter 10—Section 10.1 shows how to find the roots of polynomials of degrees less than 5, 
and Section 10.2 informally discusses some consequences of Galois’ work. Section 10.1 can 
be covered at any point in the course, and Section 10.2 only requires an understanding of 
Eisenstein’s Criterion. The material in this chapter is not a prerequisite for any of the later 
chapters. 


Chapter 11—This chapter examines rational values of trigonometric functions and explains 
why the 30°-60°-90° and 45°-45°-90° triangles tend to be the only right triangles studied in
trigonometry classes. The only background material needed for this chapter is Mathematical 
Induction and the Rational Root Test. This is another chapter that should be particularly useful 
for future teachers. The material in this chapter is not a prerequisite for any of the later 
chapters. 


Chapter 12—In this chapter, it is shown that polynomials over fields satisfy analogs of many 
properties satisfied by the integers. The proofs in Sections 12.1–12.4 are very similar to those
in Chapter 3. In Section 12.5, the relationship between multiple roots of polynomials and 
derivatives is examined. The results in this chapter will be used repeatedly throughout the 
remainder of the book. 




Chapter 13—This chapter contains material that should be of particular interest to teachers of 
precalculus and calculus. In Section 13.1, difference functions are used to find the polynomial 
of smallest degree that can produce a collection of data. As an application, Section 13.2 shows 
how to derive many of the formulas that students merely verify when first learning about 
Mathematical Induction. Section 13.3 shows why the partial fraction decomposition algorithm 
in calculus courses actually works. This section relies heavily on the division algorithm and 
Euclidean Algorithm for polynomial rings in Chapter 12. The material in this chapter is not a 
prerequisite for any of the later chapters. 
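
The idea of recovering a polynomial from data can be sketched with the standard method of repeated differences. The Python code below is a minimal illustration, not the presentation used in Chapter 13. It assumes the data are the values at x = 0, 1, 2, ..., that the list is nonempty, and that all entries are integers. The names difference_rows, smallest_degree, and evaluate are hypothetical.

```python
from math import comb


def difference_rows(values):
    """Return the data followed by its successive difference rows.

    Assumes values[i] is the observed value at x = i and that the list
    is nonempty. Stops as soon as a row is constant.
    """
    rows = [list(values)]
    while len(set(rows[-1])) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows


def smallest_degree(values):
    """Degree of the lowest-degree polynomial matching the data.

    All-zero data is reported as degree 0 for simplicity.
    """
    return len(difference_rows(values)) - 1


def evaluate(values, x):
    """Evaluate that polynomial at a nonnegative integer x.

    Uses Newton's forward difference formula: the k-th term is the first
    entry of the k-th difference row times the binomial coefficient C(x, k).
    """
    rows = difference_rows(values)
    return sum(row[0] * comb(x, k) for k, row in enumerate(rows))


if __name__ == "__main__":
    data = [1, 4, 9, 16]            # values of (x + 1)^2 at x = 0, 1, 2, 3
    print(smallest_degree(data))    # degree of the matching polynomial
    print(evaluate(data, 4))        # predicted next value
```

With the data 1, 4, 9, 16, the second differences are constant, so the sketch reports degree 2 and predicts 25 as the next value.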


Chapter 14—This chapter examines some of the key concepts in linear algebra: basis, 
dimension, spanning set, and linear independence. The material in Sections 14.1, 14.2, and 
14.3 is essential for the final three chapters of this book. However, instructors may choose to 
skip this chapter if the students have already taken a course in linear algebra. 


Chapter 15—Section 15.1 examines degrees of field extensions, and Sections 15.2 and 15.3 
look at splitting fields and Galois groups. The material in this chapter is the foundation for the 
work in Chapter 16 on ruler and compass constructions and in Chapter 17 on the insolvability 
of the quintic. If a course does not allow time for a proof of the insolvability of the quintic, 
instructors can go directly from Section 15.1 to Chapter 16 and can also skip Sections 8.3
and 8.4.


Chapter 16—This chapter contains the proof that angles cannot be trisected with ruler and
compass. It relies very heavily on Section 15.1. Although this result appears near the end of
the book, by carefully choosing which sections to skip, it can be covered in a one-semester
course. The results in this chapter are not used in Chapter 17.


Chapter 17—Sections 17.1 and 17.2 contain the proof of the insolvability of the quintic and 
also show how to produce infinite families of fifth- and seventh-degree polynomials that are 
not solvable by radicals. Section 17.3 contains additional material, such as Kronecker’s 
Theorem and the Isomorphism Theorem for Rings, that should be of particular interest for 
students planning to pursue graduate study. This section exploits one of the recurring themes 
of this book: the similarities between the integers and polynomial rings.




Acknowledgments 


I would like to thank the many people who supported me as my class notes became a book. 
First, my thanks to my DePaul colleagues Allan Berele and Stefan Catoiu for teaching from 
preliminary drafts and to Susanna Epp and Lynn Narasimhan for encouraging me to take on 
this project. Second, I thank Ken Price of the University of Wisconsin-Oshkosh for providing 
useful feedback after using a preliminary draft. Third, my thanks to Glenn Olson of Maine 
East High School for providing me with information about the connection between complex 
numbers and electrical circuits. 


I owe a debt of gratitude to Dan Tripamer of St. Viator High School for all his work producing 
the diagrams. Lauren Schultz Yuhasz of Elsevier has been enormously helpful, and the 
comments by the reviewers she found helped shape the final product. My thanks to Phil 
Bugeau of Elsevier for his help in the final stages of this project. I would also like to thank the 
University Research Council at DePaul University for their support. 


Finally, a special thank you to my wife Donna and children Renee, Sabrina, Mark, and Melisa 
for their continuous love and support. 


Jeffrey Bergen 
July 2009 




What This Book Is about and 
Who This Book Is for 


You are about to embark on a journey. Often this journey is referred to as abstract algebra. 
Others call it modern algebra, and still others simply call it algebra. But it is probably very 
different from any type of algebra you have ever studied before. 


When they are first introduced to this subject, many students feel quite intimidated. They feel 
as if they are drowning in an unending sea of meaningless definitions. Terms like group, ring, 
field, vector space, basis, dimension, homomorphism, isomorphism, and automorphism appear,
often for no apparent reason. 


Almost all of us, at some point, are intimidated by a new project. Many home repair projects 
have that effect on me. A walk through the aisles of a home improvement store can intimidate 
me to the point where it becomes difficult to even formulate an intelligent question for a sales 
clerk. The aisles and aisles of bizarre-looking devices and gadgets overwhelm me. However, 
every item is there for a reason. Each one is a tool needed to solve a problem. Suddenly one 
odd-looking device is exactly what I need to unclog my bathtub. Yet another is precisely what 
I need to make my vacuum cleaner work again. 


Abstract algebra is a subject that arose in an attempt to solve some very concrete problems. It 
is likely that you have already come across many of these problems in your previous courses 
as they occur very naturally in algebra, geometry, trigonometry, and calculus. However, in 
those courses, these problems are usually dismissed with the comment that they are beyond 
the scope of the course. 


It may seem like an odd analogy, but reading through a book in abstract algebra is not all that 
different from walking through the aisles of a home improvement store. All those intimidating 
new terms you come across in an abstract algebra book are actually tools. They are precisely 
the tools needed to finally solve many of the problems that arose but remained unsolved in 
your previous courses. 


In this book, you will be introduced to the basic terms, ideas, and concepts of abstract algebra. 
Each of these new ideas will be presented as concretely as possible. New terms and concepts
will be introduced as the tools needed to solve well-known problems. Each time we come
across a new abstract object, we will be equipped with both the knowledge of the problem it is 
being used to solve, as well as multiple concrete examples of the object. This should help 
eliminate the intimidating aspects of this subject and will allow us to understand and 
appreciate both the beauty and the importance of the subject. 


Let us now look at some of the problems that we will use abstract algebra to solve. We will list 
them according to the course where you may have first seen them. 


1.1 Algebra 
1.1.1 Finding Roots of Polynomials 


Long ago, you learned that in order to find the root of the polynomial 2x + 1, we first subtract
1 from both sides of the equation

$$2x + 1 = 0$$

to obtain the equation

$$2x = -1,$$

and then divide both sides by 2 to obtain the root

$$x = -\frac{1}{2}.$$

More generally, if a and b are real numbers, with $a \neq 0$, then to find the root of the polynomial
ax + b, we first subtract b from both sides of the equation

$$ax + b = 0$$

to obtain the equation

$$ax = -b,$$

and then divide both sides by a to obtain the root

$$x = -\frac{b}{a}.$$


Thus, we know how to find the root of any polynomial of degree 1. Moving on to polynomials
of degree 2, any such polynomial can be written as

$$ax^2 + bx + c,$$

where a, b, and c are real numbers, with $a \neq 0$. In high school, we derived the quadratic
formula that told us that the roots of $ax^2 + bx + c$ are

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$


Therefore, for polynomials of degrees 1 and 2, it was not too difficult to find formulas for their
roots. Note that these formulas were expressions involving only the coefficients of the
polynomials and that the coefficients were combined in various ways via addition, subtraction,
multiplication, division, and taking square roots. The next natural step is to look for a
formula for the roots of polynomials of degree 3. We would like to find a formula that once
again involves the coefficients. However, we would expect that, at this point, we might not
only need to take square roots but to take cube roots as well.
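To see how such a formula turns coefficients into roots using nothing but arithmetic and a square root, here is a minimal Python sketch of the quadratic formula. The function name quadratic_roots and the sample polynomial x^2 - 3x + 2 are our own choices for illustration and do not come from the text.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c using the quadratic formula."""
    if a == 0:
        raise ValueError("a must be nonzero for a polynomial of degree 2")
    # cmath.sqrt also copes with a negative discriminant (complex roots).
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))

# Illustrative input: x^2 - 3x + 2 = (x - 1)(x - 2).
r1, r2 = quadratic_roots(1, -3, 2)
```

Notice that the function touches only the coefficients a, b, and c, which is exactly the sense in which we are using the word formula.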


More generally, our goal is to find formulas for the roots of polynomials of all possible degrees 
where these formulas involve only the coefficients and the coefficients are combined in 
various ways via addition, subtraction, multiplication, division, and taking roots. By taking 
roots, we mean square roots, cube roots, fourth roots, and so on. 


In Chapter 10, we will show that such formulas do indeed exist for polynomials of degrees 3
and 4. The quadratic formula $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ is significantly more complicated than the
formula $x = -\frac{b}{a}$ for the root of polynomials of degree 1. In light of this, it is not surprising
that the formula for the roots of polynomials of degree 3 is significantly more complicated
than the quadratic formula. Again, it is no surprise that the formula for the roots of
polynomials of degree 4 is significantly more complicated than its predecessors.


The logical next step is to move on to polynomials of degree 5. Unfortunately, if one tries to 
generalize or adapt the techniques used to find the roots of polynomials of degrees 1, 2, 3,
and 4, nothing seems to work. There are two possible reasons why nothing seems to work for
polynomials of degree 5. The first possible reason is that the formula is so complicated that we 
just haven’t hit upon the approach needed to find it. Since the formula for the roots of 
polynomials of degree 4 is so much more complicated than its predecessors, it is logical to 
assume that finding a formula for the roots of polynomials of degree 5 should be an extremely 
difficult task. However, there is another possible reason why we have been unsuccessful. 
Perhaps there is no formula for the roots of polynomials of degree 5. This seems to be a 
disturbing possibility. Not only would it be disappointing to not have a formula available for 
finding the roots of polynomials of degree 5, but we also need to ask ourselves how can one 
possibly prove that no such formula exists. After all, how do we prove that something can’t be 
done or doesn’t exist? 


In one of the greatest achievements in abstract algebra, it was shown by Galois that no formula
exists for finding the roots of polynomials of degree 5. In fact, Galois showed that for any
integer $n \geq 5$, there is no formula for finding the roots of polynomials of degree n. Once again,
by a “formula” we mean an expression involving only the coefficients of the polynomial
where the coefficients are combined in various ways via addition, subtraction, multiplication, 
division, and taking roots. This famous problem is known as the insolvability of the quintic. 
Its solution will require an enormous amount of mathematical machinery and appears in 
Chapter 17. The main tools needed to solve it will be group theory and Galois theory. In fact, 
many of the terms and concepts appearing in this book are included because they are the tools 
needed to solve this famous problem. 


Before leaving this particular topic, we must remember that there are other approaches to 
finding the roots of polynomials. Essentially, the insolvability of the quintic tells us that a 
purely algebraic approach comes up short in trying to find the roots of some polynomials. 
However, depending on the application you have in mind, you may not need a formula for 
the roots of a polynomial that involves various combinations of the coefficients. Instead, you 
may need the roots computed to a certain number of decimal places. There are many numerical 
algorithms that can give you the roots of polynomials to as many decimal places of accuracy 
as you desire (or at least as many decimal places as the machine you are using can handle). 
Many of these algorithms are built into or can be easily programmed into a graphing 
calculator. Although this does not technically give you the exact answer, having the answer 
correct to a large number of decimal places may well be sufficient for the application you 
have in mind. 
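One of the simplest numerical algorithms of this kind is bisection: if a polynomial changes sign on an interval, we repeatedly halve the interval while keeping the half on which the sign change occurs. The Python sketch below is only an illustration under stated assumptions. The polynomial x^5 - x - 1, the interval from 1 to 2, and the names bisect_root and quintic are hypothetical choices of ours, and calculators typically use more refined methods.

```python
def bisect_root(p, lo, hi, decimals):
    """Approximate a root of p between lo and hi.

    Assumes p(lo) and p(hi) do not have the same sign.
    """
    if p(lo) * p(hi) > 0:
        raise ValueError("p must change sign between lo and hi")
    tolerance = 10 ** (-decimals)
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        # Keep the half-interval on which the sign change occurs.
        if p(lo) * p(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def quintic(x):
    # Hypothetical example polynomial, chosen only for illustration.
    return x**5 - x - 1

approx = bisect_root(quintic, 1.0, 2.0, 8)
```

Asking for more decimal places simply means running the loop longer, up to the precision the machine can represent.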


1.1.2 Existence of Roots of Polynomials 


As mentioned in the preceding paragraph, the phrase “finding a root” can have slightly
different meanings, depending on the context. If we are looking for the largest root of the
polynomial $x^4 - 14x^2 + 9$, then in an algebra course, you would probably write the answer in
the form $\sqrt{2} + \sqrt{5}$. (At this point, you should take a moment to check that $\sqrt{2} + \sqrt{5}$ is indeed
a root of $x^4 - 14x^2 + 9$.) However, depending on the application you had in mind, you might
want the answer to 5 decimal places, and, in this case, 3.65028 would be your answer. If you
wanted the answer to 10 decimal places, then 3.6502815398 would be the answer. On the other
hand, in the unlikely event that you needed the answer to 32 decimal places, then the answer
would be

3.65028153987288474521086239294097.
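If you would like a machine to carry out the suggested check, the following sketch uses Python's standard decimal module to evaluate the polynomial x^4 - 14x^2 + 9 at the sum of the square roots of 2 and 5. The working precision of 50 digits and the variable names are our own assumptions, chosen only so that 32 decimal places are comfortably available.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # work with 50 significant digits

root = Decimal(2).sqrt() + Decimal(5).sqrt()

# Should vanish up to rounding error in the last few digits.
residual = root**4 - 14 * root**2 + 9

# "3." followed by the first 32 decimal places (truncated, not rounded).
digits_32 = str(root)[:34]
```

A tiny residual is numerical evidence, not a proof; the exact check is the algebra you were asked to do by hand.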


Similarly, the phrase “existence of a root” can mean different things depending on the context. 
We begin by considering the polynomial x +5; if we restrict ourselves to dealing only with 
positive integers, then this polynomial has no roots. Once we expand our horizons to the set of 
integers, we see that this polynomial certainly has a root and the root is $-5$. In a similar vein,
if we restrict ourselves to dealing only with integers, then the polynomial $2x - 7$ has no roots.
By once again expanding our horizons, this time to the set of rational numbers, we see that our
polynomial certainly has a root and the root is $\frac{7}{2}$.


At various points in this book, including Chapters 2 and 3, we will show that $\sqrt{2}$ is not a
rational number. Therefore, in order for the polynomial $x^2 - 2$ to have a root, we must look
beyond the rational numbers. In calculus, one proves that there is indeed a positive real
number whose square is 2. This is an issue that we will reexamine in Chapter 4. Therefore,
by looking at yet another larger set of numbers, the real numbers, our polynomial has the
roots $\pm\sqrt{2}$. At this point, we can begin to wonder if we must continually expand the set of
numbers we are using in order to guarantee the existence of roots of all polynomials.
After all, even the real numbers do not suffice, as they do not contain a root of the
polynomial $x^2 + 1$.


The complex numbers contain an element, denoted as i, with the property that $i^2 = -1$.
Therefore both i and $-i$ are roots of the polynomial $x^2 + 1$. Every complex number can be
written in the form a + bi, where a and b are real numbers. It turns out that the complex
numbers are “big enough” that they contain the roots of all polynomials. More precisely, we
mean that any polynomial of degree at least one, whose coefficients are real numbers, has a
root in the complex numbers. Therefore the polynomials $x + 5$, $2x - 7$, $x^2 - 2$, and $x^2 + 1$,

as well as more complicated polynomials, including ones with irrational coefficients such as $\pi$ and $\sqrt{7}$,

all have a root in the complex numbers. This beautiful and important result is known as the 
Fundamental Theorem of Algebra. Interestingly enough, proofs of this result relying almost 
entirely on algebra are extremely difficult, whereas fairly elementary proofs exist that use only 
some of the basic ideas of complex numbers and multivariable calculus. We will present one 
of these relatively elementary proofs in Chapter 6. 


Think about the old story of how hard it is to find a needle in a haystack. Imagine how much 
more difficult the situation would be if you weren’t entirely sure there even was a needle in the 
haystack. If you were unable to find the needle, you would never know if the problem was 
that you hadn’t searched well enough or that the needle wasn’t there in the first place. There is 
a clear parallel with trying to find the roots of polynomials. Finding roots can be a difficult 
task, but imagine how much more difficult it would be if you didn’t know whether a root was 
there to be found. But thanks to the Fundamental Theorem of Algebra, we are guaranteed
that there will always be a root in the complex numbers. Therefore, although it may be difficult
to find a root of a polynomial, we know there is always a root in the complex numbers 
waiting to be found. 


1.1.3 Solving Linear Equations 
Let us consider the following three similar-appearing systems of linear equations:

(I)   2x + 5y = 7  &  2x + 3y = 1,
(II)  2x + 5y = 7  &  4x + 10y = 1,
(III) 2x + 5y = 7  &  4x + 10y = 14.


Despite looking somewhat similar, when we look at their solutions, we see that these systems 
of linear equations differ greatly from one another. Note that system (I) has the unique solution 
$x = -4$ and $y = 3$, whereas system (II) has no solutions, and system (III) has an infinite
number of solutions. 
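The three outcomes can be told apart mechanically. The sketch below classifies a system of two linear equations in two unknowns over the rational numbers, using the determinant and Cramer's rule. This is a technique we are supplying for illustration; the text does not say how it solved the systems, and the function name classify_2x2 is hypothetical.

```python
from fractions import Fraction

def classify_2x2(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Assumes integer inputs and that each equation has at least one
    nonzero coefficient on its left-hand side.
    """
    det = a1 * b2 - a2 * b1
    if det != 0:
        # Cramer's rule gives the unique solution.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        return ("unique", (x, y))
    # det == 0: the left-hand sides are proportional, so the system is
    # consistent exactly when the right-hand sides match that proportion.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return ("infinitely many", None)
    return ("none", None)

system_1 = classify_2x2(2, 5, 7, 2, 3, 1)
system_2 = classify_2x2(2, 5, 7, 4, 10, 1)
system_3 = classify_2x2(2, 5, 7, 4, 10, 14)
```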


In your earlier algebra courses, you probably noticed that every system of linear equations had 
one solution, no solutions, or an infinite number of solutions. Perhaps you have wondered if 
this is always the case. Or is it possible for a system of linear equations to have exactly two 
solutions or three solutions or some other number of solutions? 


Chapter 14 will include an investigation of systems of linear equations. At that point, we will 
show that when dealing with familiar number systems like the rational numbers and real 
numbers, it is indeed the case that every system of linear equations has one solution, no 
solutions, or an infinite number of solutions. However, there are other types of number 
systems, which will be introduced in Chapter 7, where there are other possibilities for the
number of solutions. 
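The text has not yet said which number systems it has in mind. As one hedged illustration, assume we compute with the integers modulo 6, that is, with remainders after division by 6. In that setting a single linear equation can already have exactly two solutions. The helper name solutions_mod_n and the particular equation 2x = 4 are our own choices.

```python
def solutions_mod_n(a, b, n):
    """Return every x in {0, ..., n-1} with a*x congruent to b modulo n."""
    return [x for x in range(n) if (a * x - b) % n == 0]

# Illustrative assumption: solve 2x = 4 in the integers modulo 6.
two_solutions = solutions_mod_n(2, 4, 6)
```

Here both x = 2 and x = 5 work, since 2 times 5 is 10, which leaves remainder 4 after division by 6.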


1.2 Geometry 
1.2.1 Ruler and Compass Constructions 


In a course in geometry, we construct various geometric objects using a ruler and a compass. 
One of the first constructions is to take a line segment and divide it into two equal pieces. It is 
not much harder to take a line segment and divide it into three, four, or any number of equal 
pieces. At that point, we might turn our attention to angles. Just as it was not difficult to bisect 
a line segment, it is also not hard to divide an angle into two equal angles. However, when we 
try to divide an angle into three equal angles, difficulties seem to arise. Certainly some angles 
can be trisected. For example, we can trisect a 90° angle and obtain a 30° angle. However, all 
attempts at finding a procedure that will work for all possible angles seem to fail. This raises 
the type of question we dealt with earlier when we discussed looking for a formula for the 
roots of polynomials of degree 5. Are we unable to find a technique for trisecting angles 
because we simply haven’t hit upon the right idea, or is it impossible to trisect angles with 
only a ruler and a compass? Once again, we are confronted with the difficult question of how 
to show that something is impossible. 


Using a ruler and compass, it is not difficult to construct equilateral triangles. Since all three 
angles of an equilateral triangle are equal, that means that we have succeeded in constructing a 
60° angle. If indeed it were possible to trisect all angles, then we could trisect our 60° angle to 
obtain a 20° angle. In Chapter 16, using tools on field extensions developed in Chapter 15, we 
will show that it is impossible to construct a 20° angle. Thus, it is indeed impossible to use a 
ruler and compass to trisect all possible angles. 


It is very common to think of positive real numbers as representing distances between points 
on a number line. We can therefore think of a positive real number a as being constructible,
meaning that with a ruler and compass, we can construct a line segment whose length is a.
What we will really be showing in Chapter 16 is that many, many real numbers are not
constructible. The key to proving that 20° angles cannot be
constructed will be to first show that cos(20°) is not a constructible real number. 
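The text does not give details at this point. One standard starting observation, which may or may not match the route taken in Chapter 16, is that the triple-angle identity cos(3t) = 4cos(t)^3 - 3cos(t) with t = 20° makes cos(20°) a root of the cubic 8x^3 - 6x - 1. The sketch below checks this numerically and then uses the Rational Root Theorem to confirm that this cubic has no rational roots. All names in the code are our own.

```python
import math
from fractions import Fraction

c = math.cos(math.radians(20))

# cos(60 degrees) = 1/2 and the triple-angle identity give 8c^3 - 6c - 1 = 0.
residual = 8 * c**3 - 6 * c - 1

def cubic(x):
    return 8 * x**3 - 6 * x - 1

# Rational Root Theorem: a rational root p/q in lowest terms needs
# p dividing 1 and q dividing 8, so only these candidates can occur.
candidates = [Fraction(sign, q) for sign in (1, -1) for q in (1, 2, 4, 8)]
rational_roots = [r for r in candidates if cubic(r) == 0]
```

An empty list of rational roots shows that cos(20°) is at least not a rational number; showing that it is not constructible requires the field extension tools mentioned above.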


1.3 Trigonometry 
1.3.1 Rational Values of Trigonometric Functions 


When we study trigonometry, we pay special attention to the 30°-60°-90° and 45°-45°-90°
right triangles. There are two reasons for this. The first is that it is fairly easy to compute
the values of the sine, cosine, and tangent functions for these two types of triangles. The
second is that for these triangles, the values of some of the trigonometric functions are so
nice; in particular, $\sin(30^\circ) = \cos(60^\circ) = \frac{1}{2}$ and $\tan(45^\circ) = 1$. This raises some questions.
In a course in trigonometry, why do we not examine other triangles that also produce
particularly nice values of the trig functions? Why do we not look for angles $\theta_1$, $\theta_2$ such that
$\sin(\theta_1)$ or $\tan(\theta_2)$ is some other simple fraction? We know that there exist acute angles $\theta_1$, $\theta_2$ with these properties,
so why do we not study them?


In trigonometry courses, we tend to look at angles using radian measure. Recall that
$30^\circ = \pi/6$, $45^\circ = \pi/4$, and $60^\circ = \pi/3$. Each of these angles is a rational number times $\pi$. By
using other geometric and trigonometric facts, such as the double-angle formula, we can
compute the values of the trigonometric functions at several other angles that are a rational
number times $\pi$. In fact, in the exercises after Chapter 11, we will show how to compute the
exact values of the cosine at two more such angles. Although these two values of the cosine do not turn out to
be rational, it certainly seems that in our trigonometry courses, we should be able to study
additional angles that are a rational number times $\pi$ such that the values of sine, cosine, or
tangent are rational numbers.


We know that there are rational multiples of $\pi$ that can make the sine and cosine functions take
on the rational values $0, \pm\frac{1}{2}, \pm 1$ and the tangent function take on the rational values $0, \pm 1$.
However, it is a surprising fact that these are the only rational values of the sine, cosine, and
tangent functions that can be obtained by plugging in an angle that is a rational number
times $\pi$. In Chapter 11, we will prove the somewhat stronger result that when plugging in
angles that are a rational number times $\pi$, the only values of the sine and cosine functions
whose squares are rational belong to the set $\{0, \pm\frac{1}{2}, \pm\frac{\sqrt{2}}{2}, \pm\frac{\sqrt{3}}{2}, \pm 1\}$. In addition, we will
see in Chapter 11 that the only values of the tangent function whose squares are rational
that are obtained by plugging in angles that are a rational number times $\pi$ belong to the
set $\{0, \pm\frac{\sqrt{3}}{3}, \pm 1, \pm\sqrt{3}\}$. Observe that these are all values of the sine, cosine, and tangent
functions that were examined in our previous courses in trigonometry. Thus, Chapter 11 will
explain why, in our previous courses, we never examined any other values of the trigonometric
functions that are rational or whose square is rational.




1.4 Precalculus 
1.4.1 Recognizing Polynomials Using Data 


In courses in algebra, precalculus, and calculus, we are often presented with a table of values 
of a function and asked to determine if the function that produced those values was linear, 
quadratic, or some other type of polynomial. Therefore, it is natural to look for a way to 
recognize if these values were indeed produced by a polynomial. Going one step further, if the 
values were produced by a polynomial, can we find the polynomial? Shortly, we will need to 
be a bit more precise about what we are asking, but first let us look at an example. 


x:      1    4    7   10   13   16   19   22   25
f(x):   7   13   19   25   31   37   43   49   55


In the table, x continues to increase by 3 as we move from left to right. As you may recall, a 
function f(x) is linear precisely if a fixed change in x always results in a fixed change in f(x). 
To test whether f(x) is linear, we will introduce a new function f(1)(x), which we define as
f(1)(x) = f(x + 3) - f(x). So, for example,


f(1)(4) = f(7) - f(4) = 19 - 13 = 6   and   f(1)(22) = f(25) - f(22) = 55 - 49 = 6.


We can see that the function f(1)(x) measures the change in f(x) as x increases by 3. Next, we 
place the values of f(1)(x) alongside those of x and f(x) on our table. 


x:         1    4    7   10   13   16   19   22   25
f(x):      7   13   19   25   31   37   43   49   55
f(1)(x):   6    6    6    6    6    6    6    6


Note that the table is left blank in one position, as we cannot compute f(1)(25) because
f(1)(25) = f(28) - f(25), and we were not given the value of f(28). Looking at the values of
f(1)(x), we see that f(1)(x) always gives us a value of 6. This tells us that whenever x increases
by 3, then f(x) increases by 6. Thus, the values on the table can indeed be produced by a linear 
function. 


At this point we need to be a little careful about our wording. Note that we did not say that 
f(x) was a “linear” function. The reason is that we were only given 9 values of f(x). It is 
possible that if we were given additional values of f(x), then the new values of f(1)(x) might 
not continue to always be 6. 


It turns out that given any finite number of data points, there are an infinite number of 
polynomials that could have produced that data. However, if we are given exactly n data 
points, where n is an arbitrary positive integer, then there will always be exactly one
polynomial of degree at most n - 1 that could produce that data. This generalizes the
fact, which we have seen in our previous algebra courses, that given two data points, there
is only one linear function that could produce that data. These are facts that we will prove in
Chapter 13. In light of this, our goal will be to find the polynomial of the smallest possible
degree that could produce a collection of data points. In the preceding example, the data for 
f(x) is produced by a linear function, and using the point-slope formula (or various other 
techniques from your previous algebra courses), we see that the values on the table could be 
produced by the function 2x +5. 
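As a quick check of this claim, the sketch below applies the point-slope formula to the first two data points of the table and then confirms that the resulting line passes through all nine points. The helper name line_through is our own, and exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction

def line_through(p, q):
    """Return (slope, intercept) of the line through two points with distinct x."""
    (x0, y0), (x1, y1) = p, q
    slope = Fraction(y1 - y0, x1 - x0)
    intercept = y0 - slope * x0   # rearranged from y - y0 = slope * (x - x0)
    return slope, intercept

xs = [1, 4, 7, 10, 13, 16, 19, 22, 25]
ys = [7, 13, 19, 25, 31, 37, 43, 49, 55]

slope, intercept = line_through((xs[0], ys[0]), (xs[1], ys[1]))
fits_all = all(slope * x + intercept == y for x, y in zip(xs, ys))
```

The slope comes out as the constant change 6 in f(x) divided by the fixed change 3 in x.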


Let us look at a second example. 


x:      1    2    3    4    5    6    7    8    9
g(x):   2    7   14   23   34   47   62   79   98


In this table, x is increasing by 1, so to see if g(x) could be linear, we look at the function 
g(1)(x) = g(x + 1) - g(x), which measures the change in g(x) as x increases by 1. Placing the
values of g(1)(x) on the preceding table, we obtain 


x:         1    2    3    4    5    6    7    8    9
g(x):      2    7   14   23   34   47   62   79   98
g(1)(x):   5    7    9   11   13   15   17   19


Since the values of g(1)(x) are not constant, the values of g(x) cannot be produced by a linear
function. However, we can consider the function g(1)(x) as measuring the “first differences”
of g(x). We can now define a new function g(2)(x), as g(2)(x) = g(1)(x + 1) - g(1)(x). Thus, we
can consider g(2)(x) as measuring the “second differences” of g(x). We will now place the
values of g(2)(x) alongside those of g(x) and g(1)(x), noting that we do not have enough
information to compute either g(2)(9) or g(2)(8).


x:         1    2    3    4    5    6    7    8    9
g(x):      2    7   14   23   34   47   62   79   98
g(1)(x):   5    7    9   11   13   15   17   19
g(2)(x):   2    2    2    2    2    2    2


We can see that the values of g(2)(x), the “second differences” of g(x), are constant. It turns
out that although the table of values of g(x) cannot be produced by a linear function, the
values can be produced by a quadratic function. In fact, the function $x^2 + 2x - 1$ does the trick.


Both of our examples are actually examples of a more general phenomenon. Suppose we are 
given a table of values for a function F(x), where, throughout the table, the change in x is the 
fixed number a. We can define a new collection of functions as follows: 


F(1)(x) = F(x + a) - F(x)   and   F(n+1)(x) = F(n)(x + a) - F(n)(x),


where n is any positive integer. We call the function F(n)(x) the nth difference function
of F(x). Note that computing the various difference functions of F(x) is not unlike computing
the higher derivatives of a function in calculus. In order to find the third derivative of a
function, you must have already found its second derivative, which in turn requires having
already computed the first derivative. Similarly, to compute F(3)(x), one must have already
found F(2)(x), which means that you need to have already found F(1)(x).
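Because each difference function is built from the previous one, the definition translates into a short loop. The sketch below assumes the table values are listed in order of increasing x with a fixed step, and it returns the values of the nth difference function at the tabulated points. The name nth_differences is our own.

```python
def nth_differences(values, n):
    """Apply the difference operation n times to a list of table values."""
    for _ in range(n):
        # Each pass shortens the list by one entry, just as each new row
        # of the table has one more blank position.
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values

g_values = [2, 7, 14, 23, 34, 47, 62, 79, 98]   # g(1), g(2), ..., g(9)
g1 = nth_differences(g_values, 1)
g2 = nth_differences(g_values, 2)
```

Running it on the table for g(x) reproduces the rows of first and second differences computed by hand.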


In Chapter 13, we will show that if the nth differences of F(x) are constant while the (n - 1)st
differences are not constant, then the values of F(x) are produced by a unique polynomial of
degree n. Observe that this is quite similar to a consequence of the Mean Value Theorem in
calculus, which states that if the nth derivative of a function is constant and the (n - 1)st
derivative is not constant, then the function must be a polynomial of degree n. We will also
show how to find the polynomial of smallest possible degree that can produce a given table of
values.
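One classical way to recover such a polynomial from its differences is Newton's forward difference formula, which writes F(x0 + s·a) as the sum of C(s, k) times the kth difference at x0, where C(s, k) is a binomial coefficient. We offer it here only as a sketch; the method developed in Chapter 13 may be organized differently, and the function names below are hypothetical.

```python
from fractions import Fraction

def leading_differences(values):
    """Return [F(x0), F(1)(x0), F(2)(x0), ...] from equally spaced table values."""
    leading = []
    while values:
        leading.append(values[0])
        values = [b - a for a, b in zip(values, values[1:])]
    return leading

def newton_polynomial(x0, step, values):
    """Return a function that evaluates the lowest-degree polynomial through the table."""
    diffs = leading_differences(values)

    def evaluate(x):
        s = Fraction(x - x0, step)        # position measured in steps from x0
        total, binomial = Fraction(0), Fraction(1)
        for k, d in enumerate(diffs):
            total += binomial * d         # binomial equals C(s, k) at this point
            binomial *= (s - k) / (k + 1)
        return total

    return evaluate

f = newton_polynomial(1, 3, [7, 13, 19, 25, 31, 37, 43, 49, 55])
g = newton_polynomial(1, 1, [2, 7, 14, 23, 34, 47, 62, 79, 98])
```

For the two tables in this section, the resulting functions agree with 2x + 5 and x^2 + 2x - 1 at every integer we try, including values of x that are not in the tables.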


When a student is first introduced to Mathematical Induction, they are usually asked to check
the validity of formulas like

$$2^2 + 4^2 + 6^2 + \cdots + (2n-2)^2 + (2n)^2 = \frac{(2n)(n+1)(2n+1)}{3}.$$

Observe that whereas we are asked to verify such formulas, we do not address the more
important and far more interesting question of how one goes about deriving formulas like the
preceding one. In Chapter 13, we will apply our results on difference functions to show how
to derive formulas like the preceding one.
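Before deriving a formula, it helps to see why a polynomial is a reasonable guess. The sketch below adds up 2^2 + 4^2 + ... + (2n)^2 directly, compares the totals with the closed form (2n)(n + 1)(2n + 1)/3, and computes the third differences of the totals. The function names are our own, and the range of n tested is an arbitrary choice.

```python
def sum_even_squares(n):
    """Compute 2^2 + 4^2 + ... + (2n)^2 by direct addition."""
    return sum((2 * k) ** 2 for k in range(1, n + 1))

def closed_form(n):
    """Evaluate (2n)(n + 1)(2n + 1) / 3; the product is always divisible by 3."""
    return (2 * n) * (n + 1) * (2 * n + 1) // 3

def differences(values):
    return [b - a for a, b in zip(values, values[1:])]

formula_holds = all(sum_even_squares(n) == closed_form(n) for n in range(1, 200))

sums = [sum_even_squares(n) for n in range(1, 9)]
third = differences(differences(differences(sums)))
```

The third differences are constant, which is what we would expect if the sums are produced by a polynomial of degree 3 in n.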


1.5 Calculus 
1.5.1 Partial Fraction Decomposition 
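Among the problems we confront in a calculus course are

$$\int \frac{1}{x(x+1)}\,dx \quad\text{and}\quad \sum_{n=1}^{\infty} \frac{1}{n(n+1)}.$$

The key to both problems is the fact that we can write

$$\frac{1}{x(x+1)} = \frac{1}{x} - \frac{1}{x+1}.$$

Using this fact, the first problem becomes

$$\int \frac{1}{x(x+1)}\,dx = \int \left(\frac{1}{x} - \frac{1}{x+1}\right) dx = \ln|x| - \ln|x+1| + C.$$

Similarly, the second problem becomes

$$\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = \sum_{n=1}^{\infty} \left(\frac{1}{n} - \frac{1}{n+1}\right) = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \cdots = 1.$$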
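The cancellation in the series can be watched directly. Assuming the series in question is the sum of 1/(n(n + 1)) for n = 1, 2, 3, ..., the sketch below adds the first N terms with exact fractions and compares the result with 1 - 1/(N + 1), which is what survives after the middle terms cancel. The name partial_sum is our own.

```python
from fractions import Fraction

def partial_sum(N):
    """Add 1/(n(n+1)) for n = 1, ..., N using exact rational arithmetic."""
    return sum(Fraction(1, n * (n + 1)) for n in range(1, N + 1))

# Telescoping predicts that the partial sum equals 1 - 1/(N + 1).
telescopes = all(partial_sum(N) == 1 - Fraction(1, N + 1) for N in range(1, 100))
```

As N grows, 1/(N + 1) shrinks toward 0, so the partial sums approach 1.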




The rational functions $\frac{1}{x}$ and $\frac{1}{x+1}$ are examples of a special type of rational function known as
a partial fraction. Recall that there are two types of partial fractions. The first, and simpler
type, consists of a real number in the numerator and a linear function raised to a positive
integer in the denominator. Partial fractions of the first type therefore have the form

$$\frac{c}{(ax+b)^k},$$

where a, b, and c are real numbers with $a \neq 0$ and k is a positive integer.


The second type of partial fraction consists of a real number or linear function in the 
numerator and an irreducible quadratic raised to a positive integer in the denominator. When 
we refer to an irreducible quadratic, we mean a quadratic polynomial that has no real roots. 
Therefore, in this context, we do not consider $x^2 - 2$ to be irreducible, since it has the real
roots $\pm\sqrt{2}$ and can be factored as


$$x^2 - 2 = (x - \sqrt{2})(x + \sqrt{2}).$$


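Partial fractions of the second type therefore have the form

$$\frac{rx + s}{(ax^2 + bx + c)^k},$$

where r and s are real numbers, $ax^2 + bx + c$ is a quadratic with no real roots, and k is a positive integer.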


In calculus courses, partial fractions are occasionally used to compute the sum of an infinite 
series, as in the preceding example. However, the primary use of partial fractions in calculus 
courses is to assist us in the integration of rational functions. In this context, when we say that 
we are integrating a function, what we really mean is that we are finding an antiderivative. 
Partial fractions of the first type are very easy to integrate. There is also a straightforward 
algorithm that can be used to integrate partial fractions of the second type, although the 
computations can get very, very messy when the exponent in the denominator is large. For 
example, solving $\int \frac{1}{x^2+1}\,dx$ is easy, and the solution is arctan(x) + C. On the other hand, to
solve

$$\int \frac{1}{(x^2+1)^k}\,dx$$

for a large exponent k, we must first use the trig substitution x = tan(y). This then reduces the problem to solving

$$\int \cos^{2k-2}(y)\,dy.$$


This integral can be solved in a straightforward way, but the work is incredibly long and 
tedious and probably takes several pages. However, the bottom line is that all partial fractions 
can be integrated. 




In calculus, one states but does not prove that every rational function can be written as the sum
of a polynomial and partial fractions. For example,

$$\frac{2x^4 + 3x^3 + 10x^2 + 11x - 13}{(x+1)(x^2+4)} = 2x + 1 - \frac{3}{x+1} + \frac{4x-5}{x^2+4}.$$


Since we can certainly integrate any polynomial, the fact that we can also integrate any partial 
fraction now means that we can integrate any rational function. However, all of this is 
predicated on the fact that we can indeed decompose any rational function into the sum of a 
polynomial and partial fractions. This takes place, almost magically, in calculus by solving a 
system of linear equations. No explanation is given as to why this procedure always works.
It turns out that an investigation of the greatest common divisors of polynomials holds the key
and, in Chapter 13, we will show why partial fraction decomposition is always possible. 
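A decomposition like this is easy to test once it has been found. The sketch below assumes the example reads (2x^4 + 3x^3 + 10x^2 + 11x - 13)/((x + 1)(x^2 + 4)) = 2x + 1 - 3/(x + 1) + (4x - 5)/(x^2 + 4) and compares both sides at a range of rational sample points using exact arithmetic. The sample points and function names are our own choices.

```python
from fractions import Fraction

def left_side(x):
    return (2*x**4 + 3*x**3 + 10*x**2 + 11*x - 13) / ((x + 1) * (x**2 + 4))

def right_side(x):
    return 2*x + 1 - 3 / (x + 1) + (4*x - 5) / (x**2 + 4)

# Rational sample points n/3, skipping x = -1 where both sides are undefined.
samples = [Fraction(n, 3) for n in range(-20, 21) if n != -3]
identity_holds = all(left_side(x) == right_side(x) for x in samples)
```

Agreement at this many points is already conclusive here: after clearing denominators, both sides are polynomials of degree at most 4, and two such polynomials that agree at more than 4 points must be equal.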


1.5.2 Detecting Multiple Roots of Polynomials 


In calculus courses, one often uses the derivative to graph polynomials. Along the way, you 
usually look for the roots of both the original function and its first derivative. Perhaps you have 
noticed that the points where the original function has a multiple root are precisely the points 
where the function and derivative have a common root. For example, consider the polynomial 


$$f(x) = x^3 - 2x^2 + x = x(x-1)^2.$$

In this case,

$$f'(x) = 3x^2 - 4x + 1 = (3x - 1)(x - 1).$$

Observe that 1 is a double root of f(x) and 1 is a root of both f(x) and f’(x).


This is no coincidence, and in Chapter 12 we shall show that a root of a polynomial g(x) is a
multiple root if and only if it is also a root of g’(x). We have already discussed how difficult it
can be to find the roots of a polynomial. Given the difficulty in finding the roots of both a 
polynomial and its derivative, you might assume that it would still be quite difficult to test if a 
polynomial has multiple roots. However, we shall see in Chapter 12 that there is an easy 
algorithm for determining if a polynomial has multiple roots. This algorithm involves 
examining the greatest common divisor of a polynomial and its derivative, and we can apply 
this algorithm even if we have no idea what the roots of our polynomial are. One consequence 
of this algorithm is that if a polynomial and its derivative have no real roots in common, then 
the polynomial will have no multiple roots in the real numbers. Furthermore, if a polynomial 
and its derivative have no complex roots in common, then the polynomial will have no 
multiple roots in the complex numbers. 
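
The algorithm itself is developed in Chapter 12. The following is only a sketch of the idea, assuming polynomials with rational coefficients stored as Python lists from the constant term upward; all helper names are hypothetical. It runs the Euclidean algorithm on a polynomial and its derivative and reports a multiple root when the greatest common divisor is not a constant, without ever computing a root.

```python
from fractions import Fraction


def trim(p):
    """Copy p as exact fractions and drop trailing zeros (highest powers last)."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def derivative(p):
    """Differentiate term by term: c * x**k becomes k * c * x**(k - 1)."""
    return trim([k * Fraction(c) for k, c in enumerate(p)][1:])


def remainder(p, d):
    """Remainder of p after long division by the nonzero polynomial d."""
    p, d = trim(p), trim(d)
    while len(p) >= len(d):
        factor = p[-1] / d[-1]
        shift = len(p) - len(d)
        for i, c in enumerate(d):
            p[shift + i] -= factor * c
        p = trim(p)
    return p


def poly_gcd(p, q):
    """Monic greatest common divisor via the Euclidean algorithm."""
    p, q = trim(p), trim(q)
    while q:
        p, q = q, remainder(p, q)
    return [c / p[-1] for c in p]


def has_multiple_root(p):
    """True when p and its derivative share a nonconstant common factor."""
    return len(poly_gcd(p, derivative(p))) > 1


f = [0, 1, -2, 1]                      # x^3 - 2x^2 + x
print(poly_gcd(f, derivative(f)))      # coefficients of x - 1
print(has_multiple_root(f))            # True
print(has_multiple_root([-1, 0, 1]))   # x^2 - 1: False
```

For the example above, the common divisor x - 1 points at the double root 1.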


Several times in this chapter we used the terms tools and machinery. These are words you are 
certainly familiar with, but they may appear out of place in a math book. Perhaps they will
look less out of place after considering the following situation: You are given the job of
clearing off a field after ten inches of snow has fallen. One approach you might take is to 
spend the time and money necessary to obtain a snowblower, whereas a second approach is to 
do the entire job using only a shovel. Notice that there are advantages to each approach. Some 
advantages of the first approach are that you would spend far less time out in the field and you 
would be much less tired after completing the job. On the other hand, with the second 
approach, you don’t need to spend either the time or the money obtaining a snowblower. In 
addition, you might enjoy working outside and might find it very fulfilling to do the job 
without the help of a machine. In many ways this is analogous to what can go on when 
attempting to solve a math problem. One approach is to put a good deal of time and effort into 
developing mathematical tools that can then be used to quickly solve the problem. An 
alternative approach is to try to solve the problem by doing lots and lots of calculations and 
computations that require hard work and patience but don’t require advanced mathematical 
ideas. As with the previous situation, there are advantages to each approach. The decision 
whether to buy the snowblower might be strongly influenced by how often there are heavy
snowfalls. If it snows heavily nine or ten times per year, you are much more likely to buy
a snowblower than if it only snows heavily once every five years. Applying the same type
of thinking to math problems, we are much more likely to invest the time developing
sophisticated mathematical tools if we suspect that these tools will be used repeatedly to solve 
problems. Throughout this book and throughout abstract algebra, a great deal of effort is 
invested in developing mathematical tools to solve problems. Like the snowblower that will be 
frequently used, these mathematical tools will be frequently used. Occasionally we will apply 
one of these tools to a problem that could be done without sophisticated mathematical tools. 
However, in the long run, we come out far ahead for having developed these tools. 


The title of this chapter consists of two parts. Hopefully, the preceding series of examples has
answered the first part of the title: “What This Book Is about.”


Teachers of mathematics at the high school and college levels will probably spend much of 
their career teaching courses in algebra, geometry, trigonometry, and calculus. As a result, this 
book is written, to a great extent, with teachers (both future and present teachers) in mind. An 
understanding of abstract algebra can help teachers better explain and appreciate many of the 
topics they teach. Thus, one of the primary goals of this book is to provide teachers with the 
understanding and appreciation of abstract algebra needed to make them better teachers. This 
book includes many topics not found in other books on abstract algebra, and they are included 
here because they relate directly to questions and problems that arise in courses in algebra, 
geometry, trigonometry, and calculus. As much as I would hope that all math majors, 
especially teachers, would study abstract algebra for an entire year, I understand that this is 
often not the case. If you will only be studying abstract algebra for one quarter, one
semester, or two quarters, you will still find plenty of topics in this book of interest to
teachers. 




Fortunately, many of you will study abstract algebra for an entire year. This will give you the 
opportunity to read the entire book and work through the more advanced material on group 
theory, field extensions, and Galois theory. In particular, you will see the proof of the
insolvability of the quintic. The concrete approach throughout this book will give you the
experience and confidence needed to master the more theoretical topics in Chapters 8, 15,
and 17. Regardless of whether or not you plan to teach, this book will provide you with the
background in abstract algebra required for courses leading to advanced degrees in either pure 
or applied mathematics. Hopefully, this also answers the second part of this chapter’s title: 
“Who This Book Is for.” 


Enjoy and learn well! 


Exercises for Chapter 1 


1. Let $P(x) = x^4 - 14x^2 + 9$.
(a) Find the roots of P(x) by (i) letting $t = x^2$, (ii) using the quadratic formula to find t,
and (iii) having found t, solve for x.


(b) Check, by plugging into P(x), that $\sqrt{2} + \sqrt{5}$, $\sqrt{2} - \sqrt{5}$, $-\sqrt{2} + \sqrt{5}$, and
$-\sqrt{2} - \sqrt{5}$ are all roots of P(x).


(c) The four roots of P(x) in parts (a) and (b) must be the same, yet they look quite 
different. For each of the four roots in part (a), find the root in part (b) that it is 
equal to. If necessary, use a calculator to examine decimal equivalents for the roots 
in parts (a) and (b). 


2. Let $G(x) = x^4 - 20x^2 + 16$.
(a) Find the roots of G(x) by (i) letting $t = x^2$, (ii) using the quadratic formula to find t,
and (iii) having found t, solve for x.


(b) Check, by plugging into G(x), that $\sqrt{3} + \sqrt{7}$, $\sqrt{3} - \sqrt{7}$, $-\sqrt{3} + \sqrt{7}$, and
$-\sqrt{3} - \sqrt{7}$ are all roots of G(x).


(c) The four roots of G(x) in parts (a) and (b) must be the same, yet they look quite 
different. For each of the four roots in part (a), find the root in part (b) that it is 
equal to. If necessary, use a calculator to examine decimal equivalents for the roots 
in parts (a) and (b). 


3. Find the four roots of the polynomial $H(x) = x^4 - 26x^2 + 81$. If necessary, use the
technique outlined in part (a) of exercises 1 and 2.


For exercises 4—8, please first read the following: 


The work of Galois on the insolvability of the quintic tells us that for polynomials of degree
$\geq 5$, there are no general formulas that will provide us with the roots of the polynomial. Again,
in this context, by a formula we mean an expression that only involves the coefficients of the
polynomial where the coefficients are combined in various ways via addition, subtraction,
multiplication, division, and taking roots.


However, there are some polynomials of degree $\geq 5$ where we can find the roots using purely
algebraic techniques. To assist with exercises 4-6, you may want to refer to the formula


$(x + y)^5 = x^5 + 5x^4y + 10x^3y^2 + 10x^2y^3 + 5xy^4 + y^5.$


4. Find the roots of $x^5 + 5x^4 + 10x^3 + 10x^2 + 5x + 33$ and $x^5 + 5x^4 + 10x^3 + 10x^2 + 5x - 20$.

5. Find the roots of $x^5 - 10x^4 + 40x^3 - 80x^2 + 80x - 61$.

6. Find the roots of $x^5 + 15x^4 + 90x^3 + 270x^2 + 405x + 360$.

7. Find the roots of $x^6 + 10x^3 + 21$.


8. Find the roots of $(x^2 - 1)^4 - 23$.


For exercises 9-11, you may assume the following: 


(a) If you can construct angles with degree measure n and m, then you can also
construct angles with degree measure $n + m$, $n - m$, and $\frac{n}{2}$.


(b) 30° and 45° angles can be constructed. 
(c) 20° angles cannot be constructed. 


9. Which of the following angles can be constructed: 10°, 15°, 40°, 75°, 95°, 105°? Explain 
your answers. 


10. It can be shown that 36° angles can be constructed. In light of this fact, which of the 
following angles can be constructed: 1°, 2°, 3°, 4°, 5°, 6°, 7°, 8°, 9°? Explain your 
answers. 


11. In light of your answers to exercise 10, determine those positive integers n for which n° 
angles can be constructed. 


12. In this exercise, you may need to use the formula 
$\cos(\theta_1 + \theta_2) = \cos(\theta_1)\cos(\theta_2) - \sin(\theta_1)\sin(\theta_2).$


(a) Find the exact value of cos(75°). 
(b) Show that cos(75°) is a root of the polynomial $16x^4 - 16x^2 + 1$.


(c) Find the other three roots of $16x^4 - 16x^2 + 1$.


(d) For each of the other three roots of $16x^4 - 16x^2 + 1$, find an angle such that the root
is the value of the cosine function at that angle.


13. Suppose $\theta$ is an angle such that $\cos(\theta)$ is a root of the polynomial $x^2 + \alpha x + \beta$, where
$\alpha$, $\beta$ are real numbers. Use the trigonometric identity $\sin^2(\theta) + \cos^2(\theta) = 1$ to show that
$\sin(\theta)$ is a root of the polynomial

$x^4 + (\alpha^2 - 2\beta - 2)x^2 + (\beta^2 + 2\beta + 1 - \alpha^2).$


14. Here is a table of some of the values of the function f(x).

x:     -23  -18  -13   -8   -3    2    7   12   17
f(x):  -51  -36  -21   -6    9   24   39   54   69

Explain why this table can be produced by a linear function, and then find the linear
function that can produce this table.


15. Here is a table of some of the values of the function g(x).

x:      -5   -3   -1    1    3    5    7    9   11
g(x):   89   31   -3  -13    1   39  101  187  297

(a) Explain why this table cannot be produced by a linear function but can be produced
by a quadratic function.


(b) Find the quadratic function that produces this table.


16. Here is a table of some of the values of the function h(x).

x:      -11    -7    -3     1     5     9    13    17    21
h(x):  -214  -102   -22    26    42    26   -22  -102  -214

Find the polynomial of smallest possible degree that can produce this table.


17. Here is a table of some of the values of the function k(x).

x:        -4    -3    -2    -1     0     1     2     3     4
k(x):  -3413  -987  -207   -35    -9    -3    13   213  1155

Find the smallest positive integer n such that this table can be produced by a polynomial
of degree n. (You do not need to find the polynomial, merely its degree.)


18. Find real numbers a and b such that


QF 3 lB 
x2+11x+28 x47 2x44 




19. Find real numbers a, b, and c such that 


Vix? = 2x44 _a bx+c 
we+x x x41" 


20. When decomposing a rational function into a sum of partial fractions, all the denominators
we use are powers of polynomials that cannot be factored any further. A similar type of
decomposition can be done for rational numbers, as we can decompose them into sums of
fractions such that all the denominators are powers of prime numbers. To illustrate this,
find integers A and B such that

$\frac{43}{20} = \frac{A}{4} + \frac{B}{5}.$

As was the case with partial fractions, one should begin by multiplying both sides of the
equation by the denominator of the left-hand side of the equation. Note that when solving
for A and B, many different answers are possible.


21. Let $F(x) = 12x^3 - 8x^2 - 17x - 5$.
(a) Find F’(x), and then find all the roots of F’(x).


(b) Check to see if any of the roots of F’(x) are also roots of F(x).


(c) Use your results from parts (a) and (b) to write both F(x) and F’(x) as products of
linear polynomials.


22. Let $G(x) = x^5 - 8x - 23$.
(a) Find G’(x). 


(b) Show that any real number that is a root of G’(x) cannot possibly also be a 
root of G(x). 


(c) Does G(x) have any multiple roots in the real numbers? Explain your answer. 




Proof and Intuition 


Think about the effect a good novel can have on you. It can move you to laughter or tears. It 
can terrify you to the point where you feel your heart racing. Pity the poor reader who 
considers a novel to be nothing more than a collection of letters and punctuation marks that 
obey various rules of grammar and spelling. 


Similarly, consider the effect a symphony can have on you. It can evoke emotions and feelings 
that were buried inside you for years. If your friends felt that a symphony was merely a long 
sequence of notes and tones, wouldn’t you feel sorry for them? 


In order to transcribe or reproduce a novel or piece of music, it sometimes must be reduced to 
a sequence of symbols on a piece of paper. However, a novel or symphony is so much more 
than a collection of symbols. In the same way, a mathematical proof is much more than a 
collection of symbols that obey various rules of syntax and logic. A proof should evoke an 
appreciation and understanding of the subject. When reading a mathematical proof, one needs 
to internalize what it is saying. Sadly, many people view mathematics as merely the formal 
manipulation of symbols. Even worse, they believe that the rules that dictate the manipulation 
of the symbols are arbitrary or random. 


We need to ask ourselves, where do mathematical proofs come from? When trying to prove 
something in mathematics, you must first spend time developing an understanding of the 
problem at hand. You need to gain experience by working with special cases and examples. 
You need to experiment with all types of possible solutions. This requires using your 
imagination. When you begin, you never know what ideas will eventually be needed. Even if 
your first 20 ideas do not lead to a solution, perhaps the 21st or 31st or 51st idea you have will 
lead to a solution. A combination of perseverance and imagination will be required. 


Suppose after minutes or hours or days of work, you succeed in finding the proof that you 
were looking for. When you prepare a written version of your proof, you no longer record all 
the false attempts or ideas that didn’t work. You only write down what did work. Whoever 
reads your proof will probably never be aware of all the different thoughts you had while 
trying to solve the problem. They will never be aware that the idea that ultimately solved the 
problem may have been a combination of bits and pieces of ideas from seven or eight failed 
attempts at a solution. In reality, the path to a solution is rarely a direct route. But this is not
what will come across to your reader. They will only see a direct path to the solution. They
will not be aware of all the hard work and false starts. Much of the intuition and imagination 
needed to solve the problem probably never comes across to your reader. The proof your 
reader will read is likely to be relatively short and extremely formal. 


At certain points, formality and precision are crucial in the writing of mathematics. When one 
transcribes a novel or piece of music for future generations, there is no place for sloppiness. 
Every word or note must be reproduced correctly; otherwise, we are no longer reflecting the 
true intentions of the composer or author. Similarly, there are places in mathematical proofs 
where there is no room for sloppiness. A mathematical proof should be written in such a way 
that it is convincing to an informed, skeptical reader. The reader must have an understanding 
of the terms and concepts being discussed. In addition, they should question every step of the 
proof. No statement should be accepted as true unless it has been logically demonstrated 
beyond a shadow of a doubt. In light of this, there is no place in a mathematical proof for 
fuzziness or ambiguities. There can be no loopholes. 


Unfortunately, the need for formality and rigor in the writing of proofs often masks the ideas 
that ultimately led to the proof. Throughout this book, you will often be required to either read 
or write proofs. In doing so, you will be forced to write in a very formal and rigorous way. The 
arguments you make must be airtight. But never forget that the proofs you write are actually 
the by-product of hard work, experimentation, and imagination. It is creativity that drives new 
discoveries in mathematics. Although the proofs you read and write may be very formal 
objects, try to look for the creativity and ideas that motivated those proofs. When reading a 
proof, you will justify statements on a line-by-line basis. But when you are done, go back and 
try to find the main ideas in the proof. Try to recognize what is special about the proof and 
how it rises above and beyond the mere manipulation of symbols. 


In addition to the many proofs that will appear in this book, we will also include many 
paragraphs denoted as “Intuition.” The goal of these additional paragraphs is to help you see 
and understand the ideas and intuition that ultimately led to the proofs. In mathematics, we 
must always try to see the forest for the trees. By necessity, texts in advanced mathematics are 
filled with formal proofs. But hopefully, you will always remain aware that proofs are merely a 
record of discoveries that would be impossible without imagination, experimentation, and 
perseverance. 


2.1 The Well Ordering Principle 


Let us begin this section by looking at two examples that illustrate the importance of both 
rigor and imagination in mathematics. 


Problem 2.1: Trisecting Angles with a Ruler and Compass 




As mentioned in Chapter 1, we will prove in Chapter 16 that there is no algorithm for 
trisecting angles with a ruler and compass. In particular, we will show that 60° angles cannot 
be trisected. However, let us consider the following procedure, where we will choose one of 
the markings on our ruler and make a special note of it. We will refer to the point on the ruler 
with this marking as M. 


Step 1 Begin with an acute angle, and let O denote its vertex. 


Step 2 Place the ruler along one edge of the angle while placing the beginning of the ruler at 
O. Let A denote the point on the edge where the marking M lands. 


Step 3 Use the ruler to extend the line segment OA past the point A on one side and past the 
point O on the other. 


Step 4 Use the compass to draw a circle with center O such that A lies on the circle. 
Step 5 Let B denote the intersection of the circle with the second edge of the angle. 


Step 6 Place the beginning of the ruler at point O such that the marking M lies on the point
B. Then slide the beginning of the ruler along the line OA away from the point A. 
While doing so, make sure that the ruler continues to pass through point B. Continue 
doing this until the marking M lands on the circle. Now let C denote the point on the 
line OA where the beginning of the ruler is and let D denote the point on the circle 
where the marking M is. 


We will now compare $\angle OCD$ and $\angle AOB$. If we draw the line connecting points O and D, it is
clear that line segments OB, OD, and CD all have the same length. Therefore, triangles
$\triangle OCD$ and $\triangle OBD$ are isosceles. We will let the symbol $m\angle$ denote the measure of an angle,
and, for convenience, we let $x = m\angle OCD$. Since $m\angle OCD = m\angle DOC$ and $\angle ODB$ is an
exterior angle of $\triangle OCD$, it follows that

$m\angle ODB = m\angle OCD + m\angle DOC = 2x.$

However, since $m\angle ODB = m\angle OBD$, it also follows that $m\angle OBD = 2x$. But $\angle AOB$ is an
exterior angle of $\triangle OCB$, so

$m\angle AOB = m\angle OCD + m\angle OBD = x + 2x = 3x = 3(m\angle OCD).$

As a result, we see that $m\angle OCD$ is one-third of $m\angle AOB$, and so we have trisected $\angle AOB$.
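
The angle chase can also be checked numerically. The sketch below imitates Step 6 under assumptions of our own: O is placed at the origin, the distance from the beginning of the ruler to the marking M is taken as 1, A = (1, 0), B lies on the unit circle, and C = (-c, 0) is slid along the line OA by bisection until the length CD equals 1. The function name neusis_angle is hypothetical.

```python
import math


def neusis_angle(theta, steps=100):
    """Imitate Step 6 for an acute angle theta (radians); return angle OCD."""
    bx, by = math.cos(theta), math.sin(theta)    # the point B on the circle

    def cd_minus_marking(c):
        # Power of the point C with respect to the circle: CD * CB = c^2 - 1,
        # where D is the point of the circle on segment CB that is nearer to C.
        cb = math.hypot(bx + c, by)
        return (c * c - 1.0) / cb - 1.0

    # Slide C away from O until the marking M lands on the circle (CD = 1).
    low, high = 1.0, 2.0
    for _ in range(steps):
        mid = (low + high) / 2.0
        if cd_minus_marking(mid) < 0.0:
            low = mid
        else:
            high = mid
    c = (low + high) / 2.0
    # Angle OCD is the angle at C between ray CO (along the line OA) and ray CB.
    return math.atan2(by, bx + c)


for degrees in (30, 60, 80):
    result = math.degrees(neusis_angle(math.radians(degrees)))
    print(degrees, round(result, 6), round(degrees / 3, 6))
```

A numerical agreement like this is only a sanity check on the geometry; the argument with exterior angles is what proves it.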


The preceding algorithm enables us to trisect any acute angle. But we had previously said that 
it is impossible to trisect 60° angles. How is this possible? How can we resolve this apparent 
contradiction? The problem is that the mathematical world is very imprecise when using the 
phrase “ruler and compass.” The preceding algorithm indicates that you really can trisect 
angles using a ruler and compass. However, when mathematicians use the word ruler in the 
phrase “ruler and compass constructions,” they are really referring to a ruler that has no 
markings on it. If mathematicians were being rigorous and precise, they would not call a ruler 
without markings a ruler. In fact, a ruler without markings is commonly called a straightedge. 
As a result, what we will really be proving in Chapter 16 is that it is impossible to trisect 
angles with only a straightedge and compass. Note that the key step in our trisection algorithm 
was Step 6, which makes strong use of the marking. As you can see, the imprecise use of the 
word ruler resulted in a great deal of confusion as to whether or not angles can be trisected 
with a ruler and compass. This example points out how rigorous and precise we must be when 
using mathematical terminology. If not, all types of apparent contradictions, like the preceding 
one, can occur. 


Problem 2.2: Moving Knights around an Altered Chessboard 


A famous problem asserts that you can place a knight on a standard 8 x 8 chessboard and move 
it around so it lands on every other square on the board exactly once. We will not prove that, as 
we will be looking at a slightly different problem. But before going any further, we should 
review the moves that a knight is permitted to make on a chessboard. A knight must move two 
spaces vertically (up or down) followed by one space horizontally (left or right) or one space 
vertically followed by two spaces horizontally. The following diagram denotes with the letter 
B all the possible places a knight could go with its next move if it was at position A. Note that 
if a knight was too close to one or more of the edges of the board, then it would no longer have 
eight places to go on its next move. 




We can now pose our problem. Suppose you removed the bottom/left square as well as the 
top/right square. This would leave us with an altered chessboard with only 62 squares. Is it still 
possible to place the knight on the board and move it around so it hits each of the remaining 61 
squares exactly once? Before reading further, you should take some time to play with and 
think about this problem. 


If you play with this problem long enough, you may begin to suspect that it is impossible to 
move the knight to each square exactly once. But how would we prove such a thing? After all, 
the inability to solve a problem hardly proves that it cannot be done. In order to prove that it is 
impossible for the knight to move in the desired way, some type of idea is needed. This is 
where the creative and imaginative aspects of mathematics come into play. The idea that will 
ultimately solve this problem is an incredibly easy one, but it will illustrate that mathematics is 
much more than formally manipulating symbols. 


Recall that every square on a chessboard is colored either red or black. Furthermore, every 
square has the opposite color from the squares that share a side with it. We can now observe 
that the bottom/left and top/right squares actually have the same color. Therefore, when we 
remove these two squares, we are left with 32 squares of one color and 30 squares of the other 
color. Now think about how a knight moves around a chessboard. As it moves, the color of the 
square it is on alternates. If it started on a red square, then it moves to a black square, then a 
red square, and so on. Similarly, if it started on a black square, then it moves to a red square, 
then a black square, and so on. That means that if it were possible to place a knight on one 
square and to move it so that it touched every other square exactly once, then it would touch 
31 red squares and 31 black squares. However, our altered chessboard has 32 squares of one 
color and 30 of the other. So it is impossible for the knight to touch every square exactly once. 
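
The two facts the argument rests on can be confirmed by a short computation. This is a sketch with assumed conventions: squares are (row, column) pairs starting at (0, 0) in the bottom/left corner, and the two colors are represented by 0 and 1. The function names are ours.

```python
KNIGHT_STEPS = [(2, 1), (2, -1), (-2, 1), (-2, -1),
                (1, 2), (1, -2), (-1, 2), (-1, -2)]


def color(row, col):
    """0 and 1 stand for the two colors; squares sharing a side differ."""
    return (row + col) % 2


def altered_board(size=8):
    """All squares except the bottom/left and top/right corners."""
    removed = {(0, 0), (size - 1, size - 1)}
    return [(r, c) for r in range(size) for c in range(size)
            if (r, c) not in removed]


def color_counts(squares):
    """Number of squares of color 0 and of color 1."""
    counts = [0, 0]
    for r, c in squares:
        counts[color(r, c)] += 1
    return counts


def knight_always_changes_color(size=8):
    """Check every legal knight move on the full board."""
    for r in range(size):
        for c in range(size):
            for dr, dc in KNIGHT_STEPS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < size and 0 <= nc < size:
                    if color(nr, nc) == color(r, c):
                        return False
    return True


print(color_counts(altered_board()))    # [30, 32]
print(knight_always_changes_color())    # True
```

The computation only verifies the two observations; the conclusion that no tour exists still comes from the counting argument, not from a search.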


The preceding solution is fairly simple once we had the idea to think about the colors of the 
remaining 62 squares and how a knight alternates colors as it moves. But without the 
imagination to think about the colors on the chessboard, it is not obvious how in the world we 
could possibly solve this problem. In light of this, we see that the solution to a mathematics 
problem is often the result of a creative or imaginative idea and not the result of boring and 
formal symbol manipulations. 




In mathematics, statements are always either true or false but never both. Some examples of 
typical mathematical statements are 


1. An even integer plus an odd integer is always odd.
2. An odd integer times an odd integer is always even.
3. Every even integer can be written as the sum of two odd integers.
4. Every even integer greater than 2 can be written as the sum of two prime numbers.
5. Every finite group with an odd number of elements is solvable.
6. Every finite group arises as the Galois group of a finite field extension of the rational
numbers.


Some of the preceding statements are obviously true, and some are obviously false. For some 
of the others, we may not yet understand what they are talking about. In fact, for statements 4 
and 6, mathematicians have been unable to determine whether or not these statements are true. 
However, our inability to determine whether or not a mathematical statement is true does not 
change the fact that it must be true or false but cannot be both. 
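
A finite search illustrates the difference between refuting a statement and proving one. In the sketch below (function names and search limits are our own assumptions), a single counterexample settles statement 2 as false, whereas finding no counterexample to statement 4 within a finite range proves nothing about all even integers.

```python
def is_prime(n):
    """Trial division; adequate for the small range searched here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def counterexample_to_statement_2(limit=25):
    """Search for odd integers a, b whose product is not even."""
    odds = [n for n in range(-limit, limit + 1) if n % 2 == 1]
    for a in odds:
        for b in odds:
            if (a * b) % 2 != 0:
                return a, b
    return None


def counterexample_to_statement_4(limit=1000):
    """Search for an even n with 4 <= n <= limit that is not a sum of two primes."""
    for n in range(4, limit + 1, 2):
        if not any(is_prime(p) and is_prime(n - p)
                   for p in range(2, n // 2 + 1)):
            return n
    return None


print(counterexample_to_statement_2())   # a pair of odd integers with odd product
print(counterexample_to_statement_4())   # None: no counterexample up to 1000
```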


Throughout this book, we will be proving whether various mathematical statements are true or 
false. As Problems 2.1 and 2.2 indicated, finding and writing proofs will involve not only 
formality and rigor but also creativity and imagination. The first algebraic object we will study 
in this course is the positive integers, which is the infinite set {1, 2, 3, 4, 5, 6, ...}. The
positive integers are frequently called the natural numbers and are abbreviated by the symbol 
N. Interestingly enough, proofs about more abstract objects like vector spaces, rings, and 
groups will frequently rely on basic facts about the positive integers. Such proofs often use a 
fundamental property of the positive integers known as the Well Ordering Principle. 


There are various ways to look at the Well Ordering Principle. Suppose you start by listing the 
days of your life 


{1,2,3,...,500, 501,502, ..., 2205, 2206, 2207,...}. 


Next, if you have ever gone to a museum, place a line over each day of your life that you went 
to a museum. Your list would then look something like 


$\{1, 2, 3, \ldots, 1166, \overline{1167}, 1168, \ldots, 2191, \overline{2192}, 2193, \ldots, 2596, \overline{2597}, 2598, \ldots\}.$


If the preceding list was correct, then it indicates that the first time you went to a museum was 
on day number 1167 of your life. It seems quite clear that if you have ever gone to a museum, 
then there must have been a first time. Similarly, if you listed the days of your life and placed
bars over each day that you attended school, then there must have been a first day that you
attended school. Viewing these examples in a more mathematical mode, the list of the days of 
your life can be viewed as a listing of the positive integers N. The days that you went to a 
museum can be viewed as a subset S of N. The fact that there was a day on which you first 
went to a museum, provided that you have ever been to a museum, means that the set S has a 
smallest element. Similarly, the fact that there was a first day that you attended school, 
provided that at some point you have gone to school, indicates that the set of days that you 
attended school also has a smallest element. This leads us to the formal statement of the Well 
Ordering Principle. 


The Well Ordering Principle. Every nonempty subset S of the positive integers N has a 
smallest element. 


Note that if you have never been to a museum, then the set S of days on which you went to a 
museum would be the empty set. Therefore, S would not have a smallest element, as it has no 
elements at all. Thus, in order to guarantee that a subset of N has a smallest element, all we 
need to know is that it is not empty. 


It is likely that you have used the Well Ordering Principle in the past in subtle ways and have 
not even realized that you were using it. For example, let us consider the following: 


Proposition 2.3. Every rational number can be written in lowest terms. 


Intuition. Suppose we start with a fraction like $\frac{792}{1008}$. It is not yet in lowest terms, as we can
easily find common factors of the numerator and denominator. In particular, 2 is a common
factor, and after dividing the numerator and denominator by 2, we see that $\frac{792}{1008} = \frac{396}{504}$. Once
again, 2 is a common factor and we reduce further to $\frac{396}{504} = \frac{198}{252}$. There are still common
factors. In particular, 2 is still a common factor, but we are free to divide the numerator and
denominator by any common factor. Since 3 is also a common factor, we can divide the
numerator and denominator by 3 to obtain $\frac{198}{252} = \frac{66}{84}$. The argument we usually give to
convince ourselves that we can write our fraction in lowest terms is that we can continue to
divide the numerator and denominator by common factors until there are no common
factors left.


But how do we know that this procedure ever comes to an end? As we continue to divide the 
numerator and denominator by common factors, what guarantee is there that we will 
ultimately end up with a fraction in lowest terms? Since it seems obvious that this procedure 
comes to an end, we usually don’t go any further in proving it. However, when we say that 
this procedure eventually comes to an end, we are subtly using the Well Ordering Principle, 
even if we don’t realize it. It is now time to see how the Well Ordering Principle is used in a 
very precise and explicit way to solve this problem. 




If we go back and reexamine our original fraction $\frac{792}{1008}$, observe that we can rewrite this
fraction using many different denominators. Our previous calculations show that we can
rewrite $\frac{792}{1008}$ in equivalent forms with 504, 252, and 84 as denominators. However, the Well
Ordering Principle guarantees that there is a smallest denominator that can be used in
rewriting $\frac{792}{1008}$. Even if we do not care to find or compute the smallest possible denominator,
the Well Ordering Principle guarantees that there is one. But notice: Once you have found this
smallest possible denominator, it must give you a fraction in lowest terms. For if there still
remained a common factor of the numerator and denominator that was greater than 1, then
dividing the numerator and denominator by that factor would result in an equivalent fraction
with a smaller denominator, which is an impossibility. Observe that for the special case of the
fraction $\frac{792}{1008}$, the smallest denominator that works is 14, and when we write $\frac{792}{1008} = \frac{11}{14}$, we
have written it in lowest terms.


Proof. If q is a rational number, let

$S = \{\, b \in \mathbb{N} \mid \text{there exists some integer } a \text{ such that } q = \frac{a}{b} \,\}.$

Since q is rational, it can be written as a quotient of integers; therefore, S is a subset of N that
is not empty. Thus, S contains a smallest element, which we can denote as d. Since we can
write q as a fraction with d as the denominator, there is some integer c such that $q = \frac{c}{d}$.


We contend that the fraction $\frac{c}{d}$ is in lowest terms. If not, then there is a common factor e of
both c and d such that e > 1. Since e is a factor of both c and d, we can write

$c = c' \cdot e$ and $d = d' \cdot e$

where c’ and d’ are integers and $d > d' \geq 1$. As a result, we now have

$q = \frac{c}{d} = \frac{c' \cdot e}{d' \cdot e} = \frac{c'}{d'}.$

However, we have now written q as a fraction with a denominator d’ that is less than d. This is
impossible, so it cannot possibly be the case that c and d have a common factor that exceeds 1.
Therefore, the fraction $\frac{c}{d}$ does indeed express q in lowest terms.
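
The proof translates almost word for word into a search for the smallest element of S. The sketch below is an illustration under our own naming (lowest_terms is a hypothetical helper): it tries d = 1, 2, 3, ... and stops at the first positive integer that works as a denominator. The search must stop, since the absolute value of the original denominator always belongs to S.

```python
import math


def lowest_terms(numerator, denominator):
    """Return (c, d), where d is the smallest positive denominator for the fraction."""
    if denominator == 0:
        raise ValueError("denominator must be nonzero")
    d = 1
    # d belongs to S exactly when numerator * d / denominator is an integer.
    while (numerator * d) % denominator != 0:
        d += 1
    c = (numerator * d) // denominator
    return c, d


c, d = lowest_terms(792, 1008)
print(c, d)              # 11 14
print(math.gcd(c, d))    # 1, so no common factor exceeds 1
```

Trying denominators one at a time is slow for large inputs; the point is only that the smallest element of S exists and is automatically in lowest terms.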


2.2 Proof by Contradiction 


There is an important technique of proof that is very common in all of mathematics, especially 
in algebra, which is often used along with the Well Ordering Principle and is known as proof 
by contradiction. As we commented earlier, every statement in mathematics must be either 
true or false but cannot be both. That means that if we are trying to prove that a mathematical
statement is true, then it is good enough to show that it cannot possibly be false. There will be
many times where it will be difficult to prove in a direct manner that a certain statement is 
true, but it will be much easier to show that it cannot be false. In some sense, we are taking 
an indirect route in proving our original statement to be true, so one often refers to a proof by 
contradiction as an indirect proof. 


This then raises the important question of how we can prove that a statement cannot be false. 
To do that we consider what logical consequences would follow if the statement were indeed 
false. Suppose that one of the logical consequences of our statement being false is the truth of 
a new statement that we know cannot possibly be true because it contradicts other statements 
that we already know to be true. In order for this contradiction to have arisen, somewhere in 
our mathematical argument there must have been a mistake or faulty piece of logic. However, 
if we were careful and made no mathematical mistakes or logical errors, then the only thing 
that could have caused our contradiction was the presumption that our original statement was 
false. Since this tells us that it is impossible for our statement to be false, then our statement 
must be true. 


There are some disadvantages or dangers in using a proof by contradiction. Let us ask 
ourselves, when doing a proof by contradiction, how can we tell when we are “done”? The 
answer is we are done when we reach a contradiction. However, if you have made even the 
slightest computational or logical error, then that could be the real reason why you arrived at a 
contradiction. Therefore, if we make a mistake while doing the proof, we will incorrectly think 
that we have proven something when we really haven’t. That is the danger in using proofs by 
contradiction. 


It goes without saying that the main purpose of a proof in mathematics is to prove something. 
However, it is always preferable when a proof also adds to our intuition and understanding. 
It would be nice if every time we read a proof, we also came away with an understanding of 
what the key ideas are that were behind the proof. In a proof by contradiction, the proof 
proceeds until we reach a contradiction, and this type of reasoning often camouflages the key 
ideas behind the proof. This is occasionally a disadvantage of using a proof by contradiction. 
On the other hand, if using a proof by contradiction is the only way (or the easiest way) to 
prove something, we would be foolish to not use one. 


The proof of the next proposition is an example of a proof that combines both the Well 
Ordering Principle and proof by contradiction. On the plus side, this proof shows the power 
and usefulness of the Well Ordering Principle. On the negative side, it is an example of a proof 
where the key ideas are, unfortunately, camouflaged, and the proof adds little to our intuition 
or understanding. 


Proposition 2.4. If n is a positive integer, then $\sqrt{n}$ is either a whole number or is irrational.




After reading a mathematical statement, one should first get a thorough understanding of what 
it really means before reading the proof. So let’s first think about what Proposition 2.4 says. 
We are to consider the numbers 


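$$\sqrt{1}, \sqrt{2}, \sqrt{3}, \sqrt{4}, \sqrt{5}, \ldots, \sqrt{99}, \sqrt{100}, \sqrt{101}, \ldots.$$


Some of them, like $\sqrt{1}, \sqrt{4}, \sqrt{9}, \sqrt{16}, \ldots$, are obviously whole numbers. The proposition
then asserts that all the other numbers on the list


$$\sqrt{2}, \sqrt{3}, \sqrt{5}, \sqrt{6}, \sqrt{7}, \sqrt{8}, \sqrt{10}, \ldots, \sqrt{99}, \sqrt{101}, \ldots$$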


must be irrational. 


Proof. Let n be a positive integer such that $\sqrt{n}$ is not a whole number; we must show that $\sqrt{n}$
is irrational. We will proceed using a proof by contradiction. Therefore, we will assume that
$\sqrt{n}$ is a rational number but is still not a whole number, and we will arrive at a contradiction.
Since $\sqrt{n}$ is not a whole number, it must lie between two consecutive positive integers.
Therefore, there exists a positive integer a such that


$$a < \sqrt{n} < a + 1.$$


Next, we let


$$S = \{\, b \in \mathbb{N} \mid b\sqrt{n} \text{ is an integer} \,\}.$$


Since $\sqrt{n}$ is rational, the set S is not empty, and the Well Ordering Principle asserts that S
contains a smallest element t. Thus, t is the smallest positive integer that, when multiplied by
$\sqrt{n}$, results in an integer.


Subtracting a from the inequalities $a < \sqrt{n} < a + 1$ results in


$$0 < \sqrt{n} - a < 1.$$


Multiplying these inequalities by t gives us


$$0 < t(\sqrt{n} - a) < t.$$


Since $t\sqrt{n}$ and $ta$ are both integers, it now follows from the preceding inequalities that
$t(\sqrt{n} - a) = t\sqrt{n} - ta$ is a positive integer that is less than t. We already know that t is the
smallest positive integer that when multiplied by $\sqrt{n}$ gives us an integer. Since $t(\sqrt{n} - a)$ is a
positive integer that is less than t, it cannot give us an integer when multiplied by $\sqrt{n}$. However,


$$(t(\sqrt{n} - a))\sqrt{n} = t(\sqrt{n})^2 - ta\sqrt{n} = tn - a(t\sqrt{n}),$$


which is indeed an integer. Therefore, we have reached a contradiction, and this concludes the 
proof. 
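
Proposition 2.4 can also be explored numerically. The Python sketch below searches for a fraction p/q whose square is n; the function name rational_sqrt and the parameter max_denominator are hypothetical names chosen for this illustration. A finite search of this kind proves nothing, since it only inspects finitely many denominators, but it shows what the proposition predicts: a fraction is found only when n is a perfect square.

```python
from math import isqrt


def rational_sqrt(n: int, max_denominator: int):
    """Search for integers p, q with (p/q)**2 == n and 1 <= q <= max_denominator.

    Returns the first pair (p, q) found, or None if no denominator
    in the searched range works.
    """
    for q in range(1, max_denominator + 1):
        target = n * q * q      # we need p*p == n*q*q
        p = isqrt(target)       # integer part of the square root
        if p * p == target:
            return p, q
    return None
```

Calling rational_sqrt(49, 50) returns (7, 1), while rational_sqrt(2, 500) returns None. According to the proposition, the second call would return None no matter how large the bound is made.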




In the final exercise following Section 2.3, you will be asked to generalize the argument used 
in the previous proof and show that if a and n are positive integers such that $a^{1/n}$ is not a
whole number, then $a^{1/n}$ must not be rational. We will also prove this in Chapter 3, and you
might wonder why we chose to prove this in two consecutive chapters. In this section, our goal 
is to illustrate that the ability to use the Well Ordering Principle in a proof by contradiction is 
an enormously useful and powerful mathematical skill. However, some of the proofs that use 
this skill do not provide much intuition or a deep understanding of what is really going on. In 
other words, if a goal of a mathematical proof is to be illustrative and enlightening, then the
shortest or most clever proof might not be the best proof.


On the other hand, the proofs in Chapter 3 might be longer and require the development of 
more mathematical tools, but they will be more intuitive and will reveal more about the 
structure of the positive integers. In particular, these proofs will be based on some fascinating 
properties that are shared by both integers and polynomials. An important and recurring theme 
throughout this book, and throughout abstract algebra, is the strong similarity between the 
properties of integers and polynomials. By developing an understanding of these properties of 
the integers in Chapter 3, we will be providing a blueprint for our study of polynomials in 
Chapters 9, 12, and 17. 


At this point, we also need to be careful when saying that we have proven that numbers like 
$2^{1/2}$, $7^{1/2}$, and $21^{1/5}$ are irrational. The proof of Proposition 2.4 does not technically show that
$2^{1/2}$ is irrational, but it shows that there is no rational number whose square is 2. Similarly, it
does not technically show that $7^{1/2}$ is irrational, but it shows that there is no rational number
whose square is 7. Along the same lines, the final exercise after Section 2.3 deals with
numbers of the form $a^{1/n}$. It appears to imply that $21^{1/5}$ is irrational, but it really implies that
there is no rational number whose fifth power is 21. It might seem like we are quibbling over a
minor point. However, in order to prove that a number is irrational, it is not enough to prove
that it is not rational. Recall that irrational numbers are those real numbers that are not
rational. Therefore, in order to prove that $2^{1/2}$, $7^{1/2}$, and $21^{1/5}$ are irrational, it is not enough to
prove that they are not rational. We must also prove that they are real numbers. Thus, before
saying that $2^{1/2}$, $7^{1/2}$, and $21^{1/5}$ are irrational, we must verify that there is a positive real
number whose square is 2, a positive real number whose square is 7, and a positive real 
number whose fifth power is 21. All of these issues will be dealt with in Chapter 4 in our 
discussion of the rational numbers and real numbers. 


2.3 Mathematical Induction 


The Well Ordering Principle seems to lend itself to use in proofs by contradiction. It is 
reasonable to wonder if it is possible to restate the Well Ordering Principle in a way that would 
allow us to prove things directly (and hopefully more intuitively) and occasionally avoid 
proofs by contradiction. To this end, let us consider a set T that has the following two
properties: (a) T contains the number 1, and (b) whenever T contains a number k, it also
contains the number k + 1. What can we say about the set T? For example, does T contain the
number 3 or 15 or 1951? 


We know from property (a) that T contains 1. At first we didn’t know if T contained 2, but 
now that we know that T contains 1, property (b) tells us that T must contain 1+ 1 = 2. (Think 
about this!) Similarly, we didn’t originally know that T contains 3, but since we showed that T 
contains 2, property (b) tells us that T must also contain 2+ 1 = 3. Now that we have the ball 
rolling, it looks like we can continue to apply property (b). Since T contains 3, it must now 
contain 3+ 1 = 4. But therefore it must also contain 5, and so on. As a result, it certainly looks 
like T must contain all the positive integers. Unfortunately, the preceding argument is not a 
complete proof, as the phrase “and so on” is not sufficiently clear or rigorous. However, once
again the Well Ordering Principle can be used to finish the proof. The new principle that we 
are obtaining from the Well Ordering Principle is known as Mathematical Induction, and it can 
be stated several ways. 


Mathematical Induction—First Version. Let T be a subset of N satisfying the following 
two properties: 


(a) T contains 1. 
(b) Whenever T contains a number k, then it also contains the number k +1. 


Then T contains all positive integers. 


Proof. Once again, we will proceed with a proof by contradiction. Therefore, we will suppose
that there are positive integers that are not in T. Now let


$$S = \{\, n \in \mathbb{N} \mid T \text{ does not contain } n \,\},$$


so S consists of those positive integers that are not in T. Since we are assuming that there are
positive integers not in T, we see that S is not empty, and the Well Ordering Principle
guarantees that S contains a smallest element, which we will denote as m. Using property (a),
we know that T contains 1, so S does not contain 1. As a result $m \neq 1$, so it must be the case
that m > 1. Now consider $m - 1$. Since m > 1, we see that $m - 1 \geq 1$. Thus, $m - 1$ is a positive
integer that is less than m. Since m is the smallest positive integer in S, it must follow that
$m - 1$ is not in S. But this tells us that $m - 1$ must be in T. We can now apply property (b).
Since $m - 1$ belongs to T, property (b) asserts that T contains $(m - 1) + 1$. However,
$m = (m - 1) + 1$, so T contains m. But this contradicts the fact that S contains m. Having
arrived at a contradiction, the proof is complete.


There are countless statements about the positive integers that can be proved using either the 
Well Ordering Principle or Mathematical Induction. The choice of which to use is up to you.
The difference is that proofs using the Well Ordering Principle usually are proofs by
contradiction, whereas proofs using Mathematical Induction are usually more direct. This 
might make it appear that, whenever possible, it is preferable to use Mathematical Induction as 
opposed to the Well Ordering Principle. However, there is another important factor to consider. 
The Well Ordering Principle, at first glance, seems much more believable than Mathematical 
Induction. Students often feel that there is some hocus pocus going on in a Mathematical 
Induction proof. Mathematical Induction allows us to assert that a certain set T contains all 
positive integers provided it satisfies properties (a) and (b). Unfortunately, students are often 
extremely unnerved by the arguments used in proving that T satisfies property (b). 


In proving that T satisfies property (b), we are actually showing that if T contains a number k,
then it also contains the number k + 1. Note that we are not assuming that T contains k, merely 
saying that if T contains k, then it also contains k + 1. However, at this point many students 
feel that we are assuming that T contains all positive integers k and they feel that we are 
making the error of assuming what we are trying to prove. Although we are not assuming what 
we are trying to prove, it is difficult for many students to rid themselves of that feeling, and, 
because of this, many students do not trust or believe Mathematical Induction. Hopefully, after 
several additional readings of the proof of how Mathematical Induction follows logically from 
the Well Ordering Principle, you will become more comfortable with Mathematical Induction 
and accept that it is a valid tool to use. 


A common way to visualize Mathematical Induction is to think about dominoes. Suppose
you are setting up a long row of dominoes. You want to set them up in such a way that after
you knock down the first domino, all the dominoes will eventually fall. What do you need to
do to guarantee that all the dominoes will fall? In particular, you need to make sure that the
11th domino is positioned so that when the 10th domino falls, it will knock down the 11th.
Similarly, you need to make sure that the 8456th domino is positioned so that when the 8455th
domino falls, it will knock down the 8456th. More generally, we need to make sure that
domino number k + 1 is positioned so that when domino number k falls, it will knock down
domino number k + 1. Suppose we now let U denote the set of dominoes that eventually fall at 
some point after we knock down the first domino. What properties does the set U have? Since 
we are knocking down the first domino, U certainly contains the first domino. Furthermore, 
we have arranged the dominoes such that if U contains domino number k, then U will also 
contain domino number k + 1. Thus, U satisfies properties completely analogous to properties 
(a) and (b) in Mathematical Induction. Just as all the dominoes will eventually fall, the set T in 
Mathematical Induction will contain all positive integers. 
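
The domino picture can be simulated for a finite row. The sketch below is only an illustration: the function fallen_dominoes and its parameters push_first and broken_link are hypothetical names introduced here, and a finite simulation is of course much weaker than Mathematical Induction, which handles all positive integers at once. Setting push_first to False corresponds to property (a) failing, and broken_link = k corresponds to property (b) failing at the number k.

```python
def fallen_dominoes(count, push_first=True, broken_link=None):
    """Return the set of dominoes (numbered 1..count) that eventually fall.

    push_first:  whether domino 1 is knocked down (property (a)).
    broken_link: a number k such that domino k fails to knock down
                 domino k + 1 (a failure of property (b)), or None.
    """
    fallen = set()
    if not push_first or count < 1:
        return fallen          # nothing ever starts the chain
    k = 1
    fallen.add(k)
    # Domino k knocks down domino k + 1 unless the link at k is broken.
    while k < count and k != broken_link:
        k += 1
        fallen.add(k)
    return fallen
```

With both properties in place, fallen_dominoes(10) is the full set {1, 2, ..., 10}. If the first domino is never pushed, the result is the empty set, and if the link at 4 is broken, only dominoes 1 through 4 fall.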


There are other useful aspects of the domino analogy. When doing a proof by Mathematical 
Induction, we need to justify that our set T satisfies properties (a) and (b). In most proofs, the 
harder case is showing that T satisfies property (b). As a result, students often consider 
property (a) to be relatively unimportant. However, if we think about our row of dominoes,
nothing happens until we knock down that first domino. Not a single domino will ever fall if
we fail to knock that first one down. Hopefully, you can see that knocking down the first 
domino is analogous to T satisfying property (a). Indeed, if we do not show that T contains 1, 
then we can never invoke property (b) and we will be unable to prove that T contains any 
integers. 


For a slight variation, suppose you set the dominoes up as before but instead you ignore the 
first few dominoes and then knock the 4th domino toward the 5th. In this case, all the
dominoes starting with the 4th will eventually fall. Similarly, if you chose to ignore the first 20 
dominoes but then knocked the 21st domino toward the 22nd domino, then all the dominoes 
starting with the 21st will eventually fall. In light of this, it is easy to see that we can restate 
our first version of Mathematical Induction in a form that looks more general. 


Mathematical Induction—First Version Revisited. Let T be a subset of N satisfying the 
following two properties: 
(a) T contains the number m.
(b) Whenever T contains a number k, where $k \geq m$, then it also contains the number k + 1.


Then T contains all positive integers greater than or equal to m.


At this point, we should give several examples of how Mathematical Induction can be used.


Problem 2.5: The Two-Color Problem for Planes Divided Up by Lines


A famous question in mathematics asks, “What is the minimum number of colors needed to 
color any map?” 


When coloring a map, our only demand is that any two regions that share a boundary must be 
colored in different colors. The following picture indicates that, in general, three colors are not 
sufficient, as the four regions in the picture must all be different colors. In one of the great 
mathematical achievements of its era, it was shown in the 1970s that four colors can be used to 
color any map. 




However, we will look at a simpler problem. Suppose the plane is split up into different 
regions using only straight lines. What is the minimum number of colors needed to color any 
such map? For an example of the type of situation we are dealing with, consider the following 
picture. Note that by putting the numbers 1 and 2 into the various regions, we can see that we
can color this map with only two colors. 


At this point, you should draw several pictures and consider cases that have 4, 5, or 6 lines 
dividing the plane and see how many colors you need in each case. Presumably, you will 
observe that in each of the cases you looked at, two colors sufficed. The question is, how 
would you go about proving that two colors always suffice? The proof will proceed by 
Mathematical Induction. 


Proof: We need to show that no matter how many lines are used to divide up the plane, we can 
color the plane using only two colors. To this end, we let 


$$T = \{\, n \in \mathbb{N} \mid \text{whenever a plane is divided up by } n \text{ lines, it can be colored with 2 colors} \,\}.$$


We will be done if we can show that T contains all positive integers, and we will do this by
applying Mathematical Induction. Therefore, we need to show that T satisfies properties
(a) and (b). To show that T satisfies property (a), we need to consider the situation where there
is only one line cutting across the plane. Since the line splits the plane into only two regions, 
we can use one color for the first region and a second color for the second region. So we see 
that two colors certainly suffice in this case and T contains 1. 


To show that T satisfies property (b), we are in the situation where T contains some positive
integer k, and we need to show that T contains k+ 1. So suppose we are given a plane that is 
divided up by k+ 1 lines. We need to show that we can color this plane using only two colors. 
The following picture will illustrate the procedure that we are about to discuss. 


[Figure: three panels labeled "Original", "With one line removed", and "Line reinserted".]


Focus on one particular line, and, for the sake of simplicity, rotate the plane so this particular 
line is now horizontal. For the moment, erase that particular line from the plane, thereby 
leaving only k lines on the plane. Since we now have a plane divided up by only k lines and
since T does contain the number k, we can color this plane using only two colors. In doing 
this, keep in mind that regions that share a border must be different colors, but regions that 
have only a point in common do not need to be different colors. 


The key idea in this proof is that when we place line k+ 1 back on the plane, we reverse the 
color of all the regions that lie above this line. Let us examine the effect this has on the 
coloring of the plane. If two regions share a border, then part or all of their border could be 
below line k + 1, part or all of their border could be above line k + 1, or their entire border is 
along line k+ 1. Any two regions that share a border below line k + 1 had already been 
colored in different colors before we reinserted line k + 1. Since no changes were made below 
line k + 1 when it was reinserted, these regions are still different colors. Similarly, any two 
regions that share a border above line k + 1 had already been colored in different colors 
before we reinserted line k + 1. However, when we reinserted line k + 1, all the colors above 
the line were reversed. Therefore, if two regions above line k + 1 had different colors before 
line k + 1 was reinserted, they still have different colors after their colors have been 
reversed. 


The only remaining case to consider is two regions whose border lies along line k + 1. Before 
line k+ 1 split these regions into two separate entities, they combined to form a single region 
and were therefore the same color. However, when line k + 1 is reinserted, the upper part of 
the region has its color reversed, so it is a different color from the lower part of the region. As 
a result, in all three cases we see that after line k+ 1 is reinserted, regions with a common 
border must be different colors. Therefore, T does contain the number k + 1, and T is indeed
the entire set of positive integers.
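
The recoloring idea in this proof can be turned into a short computation. The following Python sketch is an illustration under our own assumptions, not part of the proof: each line is stored as a triple (a, b, c) representing ax + by = c, and the side of a line where ax + by > c plays the role of the region "above" the line. Since reinserting a line reverses the colors on that side, the final color of a point depends only on whether it lies on the positive side of an even or an odd number of lines. The function name color is hypothetical.

```python
def color(point, lines):
    """Return color 1 or 2 for a point that lies on none of the given lines.

    Each line is a triple (a, b, c) meaning a*x + b*y = c.  Starting from
    color 1 everywhere, inserting a line reverses the color of every point
    on its positive side (a*x + b*y > c), as in the inductive step.
    """
    x, y = point
    flips = 0
    for a, b, c in lines:
        side = a * x + b * y - c
        if side == 0:
            raise ValueError("point lies on a line, not inside a region")
        if side > 0:
            flips += 1
    return 1 + flips % 2
```

For the three lines x = 0, y = 0, and x + y = 1, stored as [(1, 0, 0), (0, 1, 0), (1, 1, 1)], the points (-1, -1) and (1, -1) lie in regions that share a border along x = 0, and they receive colors 1 and 2, respectively. Two regions that share a border are separated by exactly one line, so their flip counts differ by one and their colors differ.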


With experience, you will see that Mathematical Induction is a wonderful tool for proving 
things that you already believe to be true but have not yet proven to be true. However, the
more important and exciting part of mathematics is the creativity and experimenting needed in
deciding what we think is true. In the previous problem, the really interesting part was 
realizing, after experimenting with several examples, that two colors did apparently suffice. 
Mathematical Induction played the important role of enabling us to prove that two colors did 
indeed suffice, but more importantly, we first needed to understand the problem well enough to 
suspect that two colors sufficed. 


Problem 2.6: For any positive integer n,


$$1 + 4 + 7 + \cdots + (3n - 2) = \frac{3n^2 - n}{2}.$$


Intuition. Before trying to solve this problem, we should first try to get a feeling for what the
problem is saying. When n = 1, the left-hand side of the equation is the sum of all numbers of
the form $3n - 2$ starting with 1 and ending with $3(1) - 2$. In other words, the left-hand side is
merely the number 1. On the other hand, the right-hand side is $\frac{3(1)^2 - 1}{2}$, which is also equal to 1.
For a more interesting case, when n = 5, the left-hand side of the formula is the sum of all
numbers of the form $3n - 2$ starting with 1 and ending with $3(5) - 2$. Therefore, the left-hand
side is


$$1 + 4 + 7 + 10 + 13.$$


On the other hand, the right-hand side is $\frac{3(5)^2 - 5}{2}$, and we see that both sides are equal to 35.


For large values of n, the preceding formula tells us that computing the sum of a long list of
numbers can be greatly simplified by merely plugging the value of n into the quadratic
polynomial $\frac{3n^2 - n}{2}$. Many of the most common problems and exercises involving Mathematical
Induction will be of this type. We will frequently be asked to prove that computing the sum of
a long list of numbers simplifies to plugging one number into a polynomial. Although
problems of this type serve as excellent practice, they are somewhat misleading as an indicator
of how mathematics is really done. In this example, we will be using Mathematical Induction
to verify that a certain formula is true. However, far more creativity and mathematical thinking
takes place in actually finding a possible formula.


Mathematical Induction enables us to prove that various formulas work, but the more 
important and interesting piece of these problems is in finding these formulas. In Chapter 13, 
we will see how to derive many of the formulas that seem to magically appear in textbooks as 
exercises on Mathematical Induction. 


Proof: We will let


$$T = \left\{\, n \in \mathbb{N} \;\middle|\; 1 + 4 + 7 + \cdots + (3n - 2) = \frac{3n^2 - n}{2} \,\right\}.$$




We will need to show that $T = \mathbb{N}$, and we will do this by using Mathematical Induction. As
we just saw, if we plug n = 1 into both sides of the formula $1 + 4 + 7 + \cdots + (3n - 2) = \frac{3n^2 - n}{2}$,
we obtain the equality 1 = 1. This tells us that T contains 1, and therefore T satisfies property
(a). In order to show that T satisfies property (b), we now need to show that if T contains a
positive integer k, then it also contains k + 1. Therefore, we may assume that k is a particular
positive integer such that


$$1 + 4 + 7 + \cdots + (3k - 2) = \frac{3k^2 - k}{2}. \tag{1}$$


We need to show that T contains k + 1. If we let n = k + 1 in the formula


$$1 + 4 + 7 + \cdots + (3n - 2) = \frac{3n^2 - n}{2},$$


we see that we need to show that


$$1 + 4 + 7 + \cdots + (3(k + 1) - 2) = \frac{3(k + 1)^2 - (k + 1)}{2}. \tag{2}$$


If we observe that the next-to-last summand of the left-hand side of (2) is $3k - 2$, we have


$$1 + 4 + 7 + \cdots + (3(k + 1) - 2) = 1 + 4 + 7 + \cdots + (3k - 2) + (3k + 1).$$


However, if we plug equation (1) into our last equation, we obtain


$$1 + 4 + 7 + \cdots + (3(k + 1) - 2) = \frac{3k^2 - k}{2} + (3k + 1).$$


Since


$$\frac{3k^2 - k}{2} + (3k + 1) = \frac{3k^2 + 5k + 2}{2},$$


we see that the left-hand side of (2) is equal to $\frac{3k^2 + 5k + 2}{2}$. On the other hand, it is easy to see that


$$\frac{3(k + 1)^2 - (k + 1)}{2} = \frac{3k^2 + 5k + 2}{2},$$


so both sides of (2) are equal to $\frac{3k^2 + 5k + 2}{2}$. Thus, T contains k + 1, and we have shown that T
satisfies property (b), thereby concluding our proof.
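
Before or after working through a proof like this one, it can be reassuring to test the formula by machine. The Python sketch below compares both sides of the formula for many values of n and also checks the algebraic identity used in the inductive step. The function names left_side and right_side are our own, and such a check covers only finitely many cases, so it supplements the proof rather than replacing it.

```python
def left_side(n: int) -> int:
    """The sum 1 + 4 + 7 + ... + (3n - 2), added up term by term."""
    return sum(3 * j - 2 for j in range(1, n + 1))


def right_side(n: int) -> int:
    """The closed form (3n^2 - n) / 2; 3n^2 - n = n(3n - 1) is always even."""
    return (3 * n * n - n) // 2


def inductive_step_holds(k: int) -> bool:
    """Check that adding the next summand 3k + 1 to the formula for k
    gives the formula for k + 1."""
    return right_side(k) + (3 * k + 1) == right_side(k + 1)
```

For n = 5, both left_side(5) and right_side(5) return 35, in agreement with the computation in the Intuition paragraph above.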


There is another version of Mathematical Induction that, depending on the problem you are 
studying, may be more useful. 




Mathematical Induction—Second Version. Let T be a subset of N satisfying the following 
two properties: 


(a) T contains 1 


(b) whenever T contains the set of numbers {1,2,..., k}, then it also contains the number 
k+1. 


Then T contains all positive integers. 


If you understood the proof of the first version of Mathematical Induction using the Well 
Ordering Principle, then it should be fairly easy for you to use the Well Ordering Principle to 
prove the second version of Mathematical Induction. In fact, there are many slightly modified 
versions of Mathematical Induction that may be of use to you depending on the problem you 
are studying. All of these versions follow from the Well Ordering Principle, and some of them 
will appear as exercises at the end of this section. 


When reading a mathematics text, we repeatedly come across the words lemma, theorem, 
corollary, and proposition. Usually the text contains no explanation of what these terms 
mean. Theorems are important mathematical facts. In mathematics, when you feel that you 
have succeeded in proving something of substance, you almost always call it a theorem. Often, 
in order to prove a theorem, you first need to prove a series of smaller mathematical facts, and 
we call these lemmas. In some sense, lemmas are identical to theorems, as they are both 
mathematical statements that require proofs. However, the difference is that we usually 
reserve the name theorem for our major results, and we call the smaller facts leading up to 
them lemmas. Once we have proven a theorem, it is often possible to use the theorem to prove 
other interesting facts. We call these corollaries. Corollaries are also mathematical statements 
that require proof. To make an analogy, one could say that a lemma is to a theorem as a 
theorem is to a corollary as, in both cases, the first is used to prove the second. A proposition is 
yet another name for a mathematical statement that requires proof. Usually an author will call 
a statement a proposition if they consider it to be important but not as important as a theorem. 
Also, a symbol that appears in many textbooks and has appeared several times in this chapter 
is $\square$. It is used to denote the end of a proof.


Exercises for Sections 2.1, 2.2, and 2.3 


1. If r is a real number such that $1 + r > 0$, prove that
$$(1 + r)^n \geq 1 + nr,$$
for all integers $n \in \mathbb{N}$.




2. In precalculus courses we saw that
$$\ln(r_1 r_2) = \ln(r_1) + \ln(r_2) \quad \text{and} \quad e^{s_1 + s_2} = e^{s_1} e^{s_2},$$
for all positive real numbers $r_1, r_2$ and real numbers $s_1, s_2$. Using these facts, prove
(a) if $n \in \mathbb{N}$ and $r_1, r_2, \ldots, r_n$ are positive real numbers, then
$$\ln(r_1 \cdot r_2 \cdots r_n) = \ln(r_1) + \ln(r_2) + \cdots + \ln(r_n);$$
(b) if $n \in \mathbb{N}$ and $r_1, r_2, \ldots, r_n$ are real numbers, then
$$e^{r_1 + r_2 + \cdots + r_n} = e^{r_1} e^{r_2} \cdots e^{r_n}.$$


3. In calculus we learned that
$$(x)' = 1 \quad \text{and} \quad (f(x)g(x))' = f'(x)g(x) + f(x)g'(x),$$
for all differentiable functions f(x), g(x). Using these facts, prove
(a) $(x^n)' = nx^{n-1}$, for all $n \in \mathbb{N}$;
(b) $(f(x)^n)' = nf(x)^{n-1} f'(x)$, for all $n \in \mathbb{N}$ and differentiable functions f(x).


4. Suppose F and G are functions both of whose domain is the real numbers such that
$$F(r_1 + r_2) = F(r_1) + F(r_2) \quad \text{and} \quad G(r_1 \cdot r_2) = G(r_1) \cdot G(r_2),$$
for all $r_1, r_2$ in the real numbers.
(a) Prove that if $n \in \mathbb{N}$ and $r_1, r_2, \ldots, r_n$ belong to the real numbers, then
$$F(r_1 + r_2 + \cdots + r_n) = F(r_1) + F(r_2) + \cdots + F(r_n).$$
(b) Prove that if $n \in \mathbb{N}$ and $r_1, r_2, \ldots, r_n$ belong to the real numbers, then
$$G(r_1 \cdot r_2 \cdots r_n) = G(r_1) \cdot G(r_2) \cdots G(r_n).$$


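5. Prove that
$$2 + 5 + 8 + \cdots + (3n - 1) = \frac{3n^2 + n}{2},$$
for all $n \in \mathbb{N}$.


6. Prove that
$$1^2 + 3^2 + 5^2 + \cdots + (2n - 1)^2 = \frac{4n^3 - n}{3},$$
for all $n \in \mathbb{N}$.


7. Prove that
$$1^3 + 2^3 + \cdots + n^3 = \frac{n^2 (n + 1)^2}{4},$$
for all $n \in \mathbb{N}$.


For exercises 8-9, please first read the following:


The Fibonacci sequence $F_n$ is defined as
$$F_1 = 1, \quad F_2 = 1, \quad \text{and} \quad F_{n+2} = F_n + F_{n+1}, \quad \text{for all } n \in \mathbb{N}.$$


8. Prove that, for any $n \in \mathbb{N}$, the largest positive integer that divides both $F_n$ and $F_{n+1}$ is 1.


9. Prove that, for any $n \in \mathbb{N}$, $|(F_{n+1})^2 - F_{n+2} F_n| = 1$.


10. If a, r are real numbers, with $r \neq 1$, prove that
$$a + ar + \cdots + ar^{n-1} + ar^n = \frac{a(1 - r^{n+1})}{1 - r},$$
for all $n \in \mathbb{N}$.


11. The triangle inequality states that if $r_1, r_2$ are real numbers, then
$$|r_1 + r_2| \leq |r_1| + |r_2|.$$
Use this fact to prove that if $n \in \mathbb{N}$ and $r_1, r_2, \ldots, r_n$ are real numbers, then
$$|r_1 + r_2 + \cdots + r_n| \leq |r_1| + |r_2| + \cdots + |r_n|.$$


12. Prove that if $n \in \mathbb{N}$, then
$$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n(n + 1)} = \frac{n}{n + 1}.$$


13. Prove that if $n \in \mathbb{N}$, then
$$\frac{1}{1 \cdot 3} + \frac{1}{2 \cdot 4} + \frac{1}{3 \cdot 5} + \cdots + \frac{1}{n(n + 2)} = \frac{3n^2 + 5n}{4(n + 1)(n + 2)}.$$


14. Prove that if $n \in \mathbb{N}$, then
$$\frac{1}{1 \cdot 4} + \frac{1}{2 \cdot 5} + \frac{1}{3 \cdot 6} + \cdots + \frac{1}{n(n + 3)} = \frac{11n^3 + 48n^2 + 49n}{18(n + 1)(n + 2)(n + 3)}.$$


15. Prove that if a set has n elements, then it has $\frac{n(n - 1)}{2}$ subsets with exactly two elements,
for every integer $n \geq 2$.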


16. Prove that if a set has n elements, then it has $\frac{n(n - 1)(n - 2)}{6}$ subsets with exactly three
elements, for every integer $n \geq 3$.


17. Prove that a set with n elements has $2^n$ subsets, for every $n \in \mathbb{N}$.


18. Let $s_n$ be the sequence defined as
$$s_1 = 11 \quad \text{and} \quad s_{n+1} = \frac{2}{3}(s_n + 5), \quad \text{for all } n \in \mathbb{N}.$$
(a) Prove that $s_n > 10$, for all $n \in \mathbb{N}$.
(b) Prove that $s_{n+1} < s_n$, for all $n \in \mathbb{N}$.


19. Let $t_n$ be the sequence defined as
$$t_1 = 9 \quad \text{and} \quad t_{n+1} = \frac{2}{3}(t_n + 5), \quad \text{for all } n \in \mathbb{N}.$$
(a) Prove that $t_n < 10$, for all $n \in \mathbb{N}$.
(b) Prove that $t_{n+1} > t_n$, for all $n \in \mathbb{N}$.


20. Let $u_n$ be the sequence defined as
$$u_1 = 5 \quad \text{and} \quad u_{n+1} = \frac{3}{5}(u_n + 12), \quad \text{for all } n \in \mathbb{N}.$$
(a) Prove that $u_n < 18$, for all $n \in \mathbb{N}$.
(b) Prove that $u_{n+1} > u_n$, for all $n \in \mathbb{N}$.


21. Let $v_n$ be the sequence defined as
$$v_1 = 20 \quad \text{and} \quad v_{n+1} = \frac{3}{5}(v_n + 12), \quad \text{for all } n \in \mathbb{N}.$$
(a) Prove that $v_n > 18$, for all $n \in \mathbb{N}$.
(b) Prove that $v_{n+1} < v_n$, for all $n \in \mathbb{N}$.


22. In calculus it is shown that if f(x), g(x) are continuous functions, then so is f(x) + g(x).
Use this fact to prove that if $n \in \mathbb{N}$ and $f_1(x), f_2(x), \ldots, f_n(x)$ are continuous functions,
then so is $f_1(x) + f_2(x) + \cdots + f_n(x)$.




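23. Let F be a function whose domain is N, which is defined as
$$F(1) = 5 \quad \text{and} \quad F(n + 1) = (n + 1)F(n), \quad \text{for all } n \in \mathbb{N}.$$
Prove that $F(n) = 5(n!)$, for all $n \in \mathbb{N}$, where $n! = 1 \cdot 2 \cdot 3 \cdots (n - 1) \cdot n$.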


24. Let G be a function whose domain is N, which is defined as
$$G(1) = 8 \quad \text{and} \quad G(n + 1) = G(n) + 3n^2 + 7n + 3, \quad \text{for all } n \in \mathbb{N}.$$
Prove that $G(n) = n^3 + 2n^2 + 5$, for all $n \in \mathbb{N}$.


25. Prove that $n^2 > 6n - 5$, for all integers $n \geq 6$.


26. Prove that $n^2 > 11n - 24$, for all integers $n \geq 9$.


27. Prove that
$$\frac{1}{1^2} + \frac{1}{2^2} + \cdots + \frac{1}{n^2} < 2 - \frac{1}{n},$$
for all integers $n \geq 2$.


28. Prove that $n^3 + 5n$ is a multiple of 3, for all $n \in \mathbb{N}$.


For exercises 29-34, please first read the following: 


You will frequently come across mathematical statements that look like they can be proven
with Mathematical Induction, yet all the versions of Mathematical Induction you have already
seen may not apply. In light of this, it will often be necessary to use different versions of
Mathematical Induction. All of these additional versions of Mathematical Induction can be
proved using the Well Ordering Principle. In exercises 29-32, you will use the Well Ordering
Principle to prove additional versions of Mathematical Induction. Then, in exercises 33-34,
you may need to apply some new versions of Mathematical Induction to prove that various
mathematical statements are true.


29. In section 2.3 of this chapter, we stated but did not prove Mathematical Induction— 
Second Version. Prove this version of Mathematical Induction. 

30. Prove the following: 
Let T be a subset of the integers satisfying the following two properties: 


(a) T contains the number m
(b) whenever T contains a number k, where $k \geq m$, then it also contains the number
k + 1.
Then T contains all integers greater than or equal to m.


31. Prove the following:
Let T be a subset of N satisfying the following two properties:
(a) T contains 2
(b) whenever T contains a number k, then it also contains the number k + 2.
Then T contains all positive even integers.


32. Prove the following:
Let T be a subset of N satisfying the following two properties:
(a) T contains 3
(b) whenever T contains a number k, then it also contains the number k + 3.
Then T contains all positive integers that are multiples of 3.


33. (a) Prove that $9^n - 1$ is a multiple of 8, for all $n \in \mathbb{N}$.
(b) Prove that $9^n - 1$ is a multiple of 80, whenever n is a positive even integer.


34. (a) Prove that $7^n - 1$ is a multiple of 48, whenever n is a positive even integer.
(b) Prove that $7^n - 1$ is a multiple of 2400, whenever n is a positive integer which is a
multiple of 4.


For exercises 35—36, please first read the following: 


Throughout these exercises, you have used Mathematical Induction to verify formulas. 
However, more important and far more exciting than verifying formulas is finding formulas. In 
the next two exercises, you will be asked to (i) try to find a formula and (ii) prove the formula
you found does indeed work. These exercises may shed some light on where the formulas in 
exercises 5—7 came from and also serve as a preview of some of our work in Chapter 13. 


35. Consider the sum

\[ 1 + 3 + 5 + \cdots + (2n - 1). \]

Suppose you suspect that the preceding sum is equal to a quadratic polynomial. If so,
there would exist real numbers $a, b, c$ such that $1 + 3 + 5 + \cdots + (2n - 1) = an^2 + bn + c$.

(a) Plug three different values of $n$ into the preceding equation to obtain three linear
equations in $a, b, c$.

(b) Solve for $a, b, c$. At this point, you will have derived a formula that you know works
for at least three values of $n$.

(c) Use Mathematical Induction to prove that the formula you found in part (b) holds
for all $n \in \mathbb{N}$.




36. Consider the sum

\[ 1^2 + 2^2 + 3^2 + \cdots + n^2. \]

Suppose you suspect that the preceding sum is equal to a cubic polynomial. If
so, there would exist real numbers $a, b, c, d$ such that
$1^2 + 2^2 + 3^2 + \cdots + n^2 = an^3 + bn^2 + cn + d$.

(a) Plug four different values of $n$ into the preceding equation to obtain four linear
equations in $a, b, c, d$.

(b) Solve for $a, b, c, d$. At this point, you will have derived a formula that you know
works for at least four values of $n$.

(c) Use Mathematical Induction to prove that the formula you found in part (b) holds
for all $n \in \mathbb{N}$.


For exercises 37-40, please first read the following: 


Although there are many different versions of Mathematical Induction, they all contain two 
parts. Therefore, anytime you use Mathematical Induction to prove something, you will need 
to verify that both parts of Mathematical Induction are satisfied. It is often the case that 
verifying part (a) of Mathematical Induction is quite routine, whereas verifying part (b) can be 
quite difficult. As a result, students sometimes fall into the trap of thinking that a proof using 
Mathematical Induction requires only that you verify part (b). However, there are countless 
statements that are clearly false even though part (b) of Mathematical Induction can be easily 
verified. A very simple example is the statement: 


All positive integers are greater than 1,000,000. 


Obviously, this statement is false, but observe that part (b) of Mathematical Induction is
satisfied: if $k$ is a positive integer that exceeds 1,000,000, then $k + 1$ is certainly also
a positive integer that exceeds 1,000,000. In the next four exercises, you will examine other
mathematical statements that are false despite the fact that part (b) of Mathematical Induction
holds.


37. In doing this exercise, you may want to refer back to exercise 33.
(a) Consider the statement:

$9^n - 1$ is divisible by 9, for all $n \in \mathbb{N}$.

Explain why this statement is false, and then show that part (b) of Mathematical
Induction is satisfied.

(b) Consider the statement:

$9^n - 1$ is divisible by 81, whenever $n$ is a positive even integer.

Explain why this statement is false, and then show that part (b) of Mathematical
Induction is satisfied.

38. In doing this exercise, you may want to refer back to exercise 34.
(a) Consider the statement:

$7^n - 1$ is divisible by 49, whenever $n$ is a positive even integer.

Explain why this statement is false, and then show that part (b) of Mathematical
Induction is satisfied.

(b) Consider the statement:

$7^n - 1$ is divisible by 2401, whenever $n$ is a positive integer
which is a multiple of 4.

Explain why this statement is false, and then show that part (b) of Mathematical
Induction is satisfied.

39. In this exercise, the sequences $s_n$ and $t_n$ are those defined in exercises 18 and 19.
(a) Consider the statement:

$s_n < 10$, for all $n \in \mathbb{N}$.

Explain why this statement is false, and then show that part (b) of Mathematical
Induction is satisfied.

(b) Consider the statement:

$t_n > 10$, for all $n \in \mathbb{N}$.

Explain why this statement is false, and then show that part (b) of Mathematical
Induction is satisfied.

40. In this exercise, the sequences $u_n$ and $v_n$ are those defined in exercises 20 and 21.
(a) Consider the statement:

$u_n > 18$, for all $n \in \mathbb{N}$.

Explain why this statement is false, and then show that part (b) of Mathematical
Induction is satisfied.

(b) Consider the statement:

$v_n < 18$, for all $n \in \mathbb{N}$.

Explain why this statement is false, and then show that part (b) of Mathematical
Induction is satisfied.


41. Let $a$ be a real number, and define the sequence $s_n$ as follows:

$s_1 = a$ and $s_{n+1} = $ [formula illegible in the source], for all $n \in \mathbb{N}$.

(i) Consider the statement:

$s_n > 1$, for all $n \in \mathbb{N}$.

Explain why part (b) of Mathematical Induction is satisfied.

(ii) Consider the statement:

$s_n < 1$, for all $n \in \mathbb{N}$.

Compare this statement to the statement in part (i), and then explain why part (b)
of Mathematical Induction is again satisfied.

(iii) Analyze whether the statements in parts (i) and (ii) are true.

(iv) Show that if $a > 1$, then the statement in part (i) is true.

(v) Show that if $a < 1$, then the statement in part (ii) is true.

(vi) What can you say about the sequence if $a = 1$?

(vii) What does this tell you about the importance of part (a) of Mathematical
Induction?

42. Let $a$ and $n$ be positive integers such that $a^{1/n}$ is a rational number.

(a) Show that $a^{m/n}$ is rational, for every $m \in \mathbb{N}$.

(b) Show that there exists a smallest positive integer $t$ with the property that $t \cdot a^{i/n}$ is an
integer, for all $1 \leq i \leq n - 1$.

(c) Show that there is a positive integer $b$ with the property that $0 \leq a^{1/n} - b < 1$.

(d) Show that $t(a^{1/n} - b)$ is a nonnegative integer that is less than $t$.

(e) Show that $t(a^{1/n} - b)$ also has the property that $(t(a^{1/n} - b)) \cdot a^{i/n}$ is an integer, for
all $1 \leq i \leq n - 1$.

(f) Show that $t(a^{1/n} - b) = 0$.

(g) Conclude that $a^{1/n}$ is a whole number.

(h) Use parts (a)-(g) to prove that if $a$ and $n$ are positive integers, then either $a^{1/n}$ is a
whole number or is not rational.




2.4 Functions and Binary Operations 


Functions play an important role in virtually every branch of mathematics. In abstract 
algebra not only do we use functions to examine the structure of objects like fields, rings, 
and groups, but sets of functions often form important algebraic objects in their own right. 


In calculus, you examined the graphs and other properties of functions like
$f(x) = e^x$, $g(t) = -16t^2 + 400$, and $h(\theta) = \tan(\theta)$.


At that point, almost all the functions you looked at were represented by a fairly straightforward 
rule or formula. For that reason, we informally defined a function as a “rule” that assigns to each 
element in a set an element in a second set. 


This intuitive approach is a very appropriate way to look at functions in calculus where many 
functions arise in an attempt to describe or model real-world situations. Therefore, it comes as 
no surprise that most functions in calculus are represented by a straightforward rule or 
formula. As we have indicated at various points in this chapter, mathematics usually requires a 
mix of intuition and rigor. Defining a function as a “rule” works quite well in many contexts 
such as calculus. However, since we don’t really know exactly what constitutes a rule, a more 
rigorous approach is sometimes needed. As will be the case throughout this book, rigorous 
definitions will often be preceded by intuitive examples. 


When we look at $f(x) = e^x$, we know that

$f(0) = 1$, $f(1) = e$, and $f(\ln(5)) = 5$.

Therefore, three of the ordered pairs that belong to the graph of $f$ are $(0, 1)$, $(1, e)$, and
$(\ln(5), 5)$. Observe that if we let $\mathbb{R} \times \mathbb{R}$ denote all ordered pairs of real numbers, then every
element of the graph of $f$ is an element of the set $\mathbb{R} \times \mathbb{R}$. Thus, we can think of the graph
of $f$ as being a particular type of subset of $\mathbb{R} \times \mathbb{R}$.


For an example far removed from calculus, let $V = \{a, b, c\}$ and $W = \{1, 2\}$. Next, suppose
$G : V \to W$ is the function such that

$G(a) = 2$, $G(b) = 1$, and $G(c) = 1$.

Then we can think of the graph of $G$, or simply think of $G$, as the ordered pairs $\{(a, 2), (b, 1),
(c, 1)\}$. If we let $V \times W$ denote all ordered pairs where the first term belongs to $V$ and the
second term belongs to $W$, then we can think of the function $G$ as being a subset of $V \times W$.


However, not all subsets of $V \times W$ fit our intuitive notion of a function. Consider the ordered
pairs

$\{(a, 2), (b, 1), (b, 2), (c, 1)\} \subset V \times W$.

If this represented some function $H$, then we can ask ourselves, what is $H(b)$? The ordered
pair $(b, 1)$ indicates that $H(b)$ should be 1. On the other hand, the ordered pair $(b, 2)$ indicates
that $H(b)$ should be 2. When this type of ambiguity arises, we will not consider the ordered
pairs to represent a function.

There is yet another issue to consider. Using the same $V$ and $W$ as in the previous paragraph,
let's examine the ordered pairs

$\{(a, 1), (c, 2)\} \subset V \times W$.

If these ordered pairs represent a function $J$, then what is $J(b)$? We will deal with this situation
somewhat differently than you might have done in the past. In calculus, you examined the
function $h : \mathbb{R} \to \mathbb{R}$ defined as $h(\theta) = \tan(\theta)$. Note that $h$ is not defined whenever $\theta$ is of the
form $\frac{\pi}{2} + n\pi$, where $n$ is an integer. However, we still wrote $h : \mathbb{R} \to \mathbb{R}$, even though there are
many elements in $\mathbb{R}$ for which $h$ is not defined. In the more formal approach typically used in
abstract algebra, we will handle things differently. When we refer to a function $f$ from a set $S$
to a set $T$, we will insist that $f(s)$ is defined for every element $s$ of $S$.


Given sets $S$ and $T$, we can now put all the pieces together and describe exactly which subsets
of $S \times T$ can represent a function $f$ from $S$ to $T$. We first observed that every element of $S$
occurs in at most one ordered pair of $f$. Next, we observed that every element of $S$ must occur
in at least one ordered pair of $f$. Together, these observations lead us to

Definition 2.7. If $S$ and $T$ are sets, then a function $f : S \to T$ is a subset of $S \times T$ such that
every element of $S$ occurs as the first term of exactly one ordered pair of $f$.

Given a function $f : S \to T$, we call $S$ the domain of $f$. If an ordered pair $(a, b)$ belongs
to $f$, we write $f(a) = b$ and say that $b$ is the image of $a$. Furthermore, we call the set
$\{b \in T \mid b = f(a), \text{ for some } a \in S\}$ the range of $f$. Thus, the range of $f$ is the subset
of $T$ consisting of all values of the function $f$.
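For finite sets, the requirement in Definition 2.7 can be checked mechanically. The following is a minimal sketch, not part of the original text: the helper names `is_function` and `function_range` are hypothetical, and the second term paired with c in H is an assumption. It tests the three sets of ordered pairs G, H, and J discussed above.

```python
def is_function(pairs, S, T):
    """Return True if `pairs` is a function from S to T in the sense of Definition 2.7."""
    # Every ordered pair must lie in S x T.
    if any(s not in S or t not in T for (s, t) in pairs):
        return False
    # Every element of S must occur as the first term of exactly one pair.
    first_terms = [s for (s, _) in pairs]
    return all(first_terms.count(s) == 1 for s in S)


def function_range(pairs):
    """Return the range: the set of second terms of the ordered pairs."""
    return {t for (_, t) in pairs}


V = {"a", "b", "c"}
W = {1, 2}

G = {("a", 2), ("b", 1), ("c", 1)}            # the function G
H = {("a", 2), ("b", 1), ("b", 2), ("c", 1)}  # b is the first term of two pairs
J = {("a", 1), ("c", 2)}                      # b is the first term of no pair

print(is_function(G, V, W))       # True
print(is_function(H, V, W))       # False
print(is_function(J, V, W))       # False
print(sorted(function_range(G)))  # [1, 2]
```

H fails the "at most one" requirement and J fails the "at least one" requirement, which are exactly the two observations that lead to the definition.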


Example 


Let $S = \{\alpha, \beta\}$ and $T = \{x, y, z\}$ and let $f : S \to T$ be the function $\{(\alpha, y), (\beta, x)\}$. Then, in
this case, $f(\alpha) = y$, $f(\beta) = x$, and the range of $f$ is the set $\{x, y\}$.

We will now contrast the previous function $f$ to the new function $g : S \to T$ defined as
$\{(\alpha, z), (\beta, z)\}$. One big difference between $f$ and $g$ is that $g$ "repeats values." More precisely,
$g$ takes on the value of $z$ twice as $g(\alpha) = z = g(\beta)$. We now give a special name to functions
that never repeat values.

Definition 2.8. Let $f : S \to T$ be a function. We say that $f$ is injective or one-to-one if $f$
never repeats any values. This means that whenever $a$ and $b$ are different elements of $S$, then
$f(a)$ and $f(b)$ are different elements of $T$.


Example

Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be given by the formulas $f(x) = x^2$ and $g(x) = x^3$.
Observe that $f$ is not injective, as there are many cases where $a \neq b$, yet $f(a) = f(b)$.
For example, $f(5) = f(-5)$. However, $g$ is injective, for if $a, b \in \mathbb{R}$ and $a \neq b$, then
$g(a) = a^3 \neq b^3 = g(b)$.

If you suspect that a function $h : S \to T$ is injective, then one approach to proving that
it is injective is to assume that $s_1, s_2 \in S$ satisfy $h(s_1) = h(s_2)$ and then show that $s_1 = s_2$.

Example

Let $h : \mathbb{R} \to \mathbb{R}$ be given by $h(x) = 3x - 5$. To show that $h$ is injective, suppose $a, b \in \mathbb{R}$
such that $h(a) = h(b)$. Thus, $3a - 5 = 3b - 5$. But this quickly implies that $3a = 3b$, which
then yields $a = b$. Thus, $h$ is indeed injective.


To illustrate another important property of functions, consider the following: 


Example

Let $S = \{1, 2, 3, 4\}$ and $T = \{a, b, c\}$. Next, let $m : S \to T$ be given by the ordered pairs
$\{(1, c), (2, c), (3, a), (4, a)\}$ and let $n : S \to T$ be given by the ordered pairs $\{(1, b), (2, a),
(3, a), (4, c)\}$. Observe that the range of $m$ is not all of $T$, as $b$ is not a value of $m$. On the
other hand, the range of $n$ is all of $T$, as every element of $T$ is indeed a value of $n$. This
leads to

Definition 2.9. Let $f : S \to T$ be a function. We say that $f$ is surjective or onto if the
range of $f$ is all of $T$. This means that for every $t \in T$, there exists some $s \in S$ such that
$f(s) = t$.
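When a function on a finite set is stored as a set of ordered pairs, Definitions 2.8 and 2.9 translate directly into short checks. The sketch below is illustrative only: the helper names are hypothetical, and both helpers assume their input is already a function in the sense of Definition 2.7. It uses the functions m and n from the preceding example.

```python
def is_injective(pairs):
    """A function is injective if no value (second term) is repeated."""
    values = [t for (_, t) in pairs]
    return len(values) == len(set(values))


def is_surjective(pairs, T):
    """A function into T is surjective if its range is all of T."""
    return {t for (_, t) in pairs} == set(T)


T = {"a", "b", "c"}
m = {(1, "c"), (2, "c"), (3, "a"), (4, "a")}
n = {(1, "b"), (2, "a"), (3, "a"), (4, "c")}

print(is_surjective(m, T))  # False: b is not a value of m
print(is_surjective(n, T))  # True: every element of T is a value of n
print(is_injective(n))      # False: n(2) = n(3) = a
```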




To illustrate our next concept, let us look at the following: 


Example

Let $S = \{1, 2, 3\}$, $T = \{\alpha, \beta\}$, and $W = \{a, b, c\}$. Next, let $g : S \to T$ be the function
$\{(1, \beta), (2, \alpha), (3, \alpha)\}$ and $f : T \to W$ be the function $\{(\alpha, c), (\beta, a)\}$.

Observe that we can create a new function from $S$ to $W$, which we will denote as $f \circ g$, by
first "plugging" elements of $S$ into $g$ and then plugging the result into $f$. More precisely,

\[
\begin{aligned}
(f \circ g)(1) &= f(g(1)) = f(\beta) = a, \\
(f \circ g)(2) &= f(g(2)) = f(\alpha) = c, \\
(f \circ g)(3) &= f(g(3)) = f(\alpha) = c.
\end{aligned}
\]

Therefore, the new function $f \circ g : S \to W$ consists of the ordered pairs $\{(1, a), (2, c),
(3, c)\}$.

Definition 2.10. If $g : S \to T$ and $f : T \to W$ are functions, then the composition
$f \circ g : S \to W$ is the function consisting of all ordered pairs of the form $(s, f(g(s)))$,
where $s \in S$.
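Definition 2.10 can be mirrored almost word for word in code when functions are stored as sets of ordered pairs. The following is a small sketch under that assumption; the name `compose` and the strings "alpha" and "beta" standing in for the two elements of T are choices made here, not part of the text.

```python
def compose(f, g):
    """Return f o g as the set of pairs (s, f(g(s))), for functions given as sets of pairs."""
    f_map = dict(f)  # lookup table sending each t to f(t)
    return {(s, f_map[t]) for (s, t) in g}


g = {(1, "beta"), (2, "alpha"), (3, "alpha")}  # g : S -> T
f = {("alpha", "c"), ("beta", "a")}            # f : T -> W

print(sorted(compose(f, g)))  # [(1, 'a'), (2, 'c'), (3, 'c')]
```

The result agrees with the ordered pairs computed by hand in the example.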


Now let us push the previous example a bit further. 


Example

Let $S$, $T$, $W$, $g$, and $f$ be as in the previous example, and now let $V = \{x, y, z, w\}$. Next, let
$h : V \to S$ be the function consisting of the ordered pairs $\{(x, 2), (y, 1), (z, 2), (w, 3)\}$.
Therefore, in addition to the composition $f \circ g : S \to W$, we can also look at the
composition $g \circ h : V \to T$. Some straightforward computations tell us that

\[
\begin{aligned}
(g \circ h)(x) &= g(h(x)) = g(2) = \alpha, \\
(g \circ h)(y) &= g(h(y)) = g(1) = \beta, \\
(g \circ h)(z) &= g(h(z)) = g(2) = \alpha, \\
(g \circ h)(w) &= g(h(w)) = g(3) = \alpha.
\end{aligned}
\]

Therefore, $g \circ h$ is the subset of $V \times T$ consisting of the four ordered pairs

$\{(x, \alpha), (y, \beta), (z, \alpha), (w, \alpha)\}$.




At this point, we can now form two new compositions $(f \circ g) \circ h$ and $f \circ (g \circ h)$, both of
which are functions from $V$ to $W$. Computing the ordered pairs in $(f \circ g) \circ h$, we have

\[
\begin{aligned}
((f \circ g) \circ h)(x) &= (f \circ g)(h(x)) = (f \circ g)(2) = f(g(2)) = f(\alpha) = c, \\
((f \circ g) \circ h)(y) &= (f \circ g)(h(y)) = (f \circ g)(1) = f(g(1)) = f(\beta) = a, \\
((f \circ g) \circ h)(z) &= (f \circ g)(h(z)) = (f \circ g)(2) = f(g(2)) = f(\alpha) = c, \\
((f \circ g) \circ h)(w) &= (f \circ g)(h(w)) = (f \circ g)(3) = f(g(3)) = f(\alpha) = c.
\end{aligned}
\]

On the other hand, when we compute the ordered pairs in $f \circ (g \circ h)$, we have

\[
\begin{aligned}
(f \circ (g \circ h))(x) &= f((g \circ h)(x)) = f(g(h(x))) = f(g(2)) = f(\alpha) = c, \\
(f \circ (g \circ h))(y) &= f((g \circ h)(y)) = f(g(h(y))) = f(g(1)) = f(\beta) = a, \\
(f \circ (g \circ h))(z) &= f((g \circ h)(z)) = f(g(h(z))) = f(g(2)) = f(\alpha) = c, \\
(f \circ (g \circ h))(w) &= f((g \circ h)(w)) = f(g(h(w))) = f(g(3)) = f(\alpha) = c.
\end{aligned}
\]

Observe that in the preceding example the ordered pairs for both $(f \circ g) \circ h$ and $f \circ (g \circ h)$ are
$\{(x, c), (y, a), (z, c), (w, c)\}$. Therefore, the functions $(f \circ g) \circ h$ and $f \circ (g \circ h)$ are the same.
This is no coincidence, and it leads us to


Theorem 2.11. The composition of functions is associative. This means that if $f : C \to D$,
$g : B \to C$, and $h : A \to B$ are functions, then the functions $(f \circ g) \circ h : A \to D$ and
$f \circ (g \circ h) : A \to D$ are the same and consist of all ordered pairs in $A \times D$ of the form
$(a, f(g(h(a))))$, where $a \in A$.

Intuition. Before providing a very short formal proof, we provide a more informal
discussion of what is taking place. Observe that if $a \in A$, then when we compute where
$(f \circ g) \circ h$ sends $a$, we first plug $a$ into $h$ and then plug $h(a)$ into $f \circ g$. But plugging $h(a)$ into
$f \circ g$ means first plugging $h(a)$ into $g$ to obtain $g(h(a))$ and then plugging $g(h(a))$ into $f$ to
give us the final answer of $f(g(h(a)))$.

On the other hand, if $a \in A$, then to see where $f \circ (g \circ h)$ sends $a$, we first plug $a$ into $g \circ h$.
This gives us a value of $g(h(a))$, which next gets plugged into $f$ to give us $f(g(h(a)))$.
The bottom line is that regardless of whether $a \in A$ is plugged into $(f \circ g) \circ h$ or $f \circ (g \circ h)$, we
end up with a final value of $f(g(h(a)))$.

Proof. If $a \in A$, then

\[ ((f \circ g) \circ h)(a) = (f \circ g)(h(a)) = f(g(h(a))), \]

whereas

\[ (f \circ (g \circ h))(a) = f((g \circ h)(a)) = f(g(h(a))). \]

Thus, both $(f \circ g) \circ h$ and $f \circ (g \circ h)$ send $a$ to $f(g(h(a)))$ and therefore consist of all ordered
pairs of the form $(a, f(g(h(a))))$, where $a \in A$.
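As a concrete check of Theorem 2.11, the two compositions from the preceding example can be computed and compared. This is only a sketch on one example, not a substitute for the proof; the helper `compose` and the strings "alpha" and "beta" are naming assumptions made here.

```python
def compose(f, g):
    """Return f o g as the set of pairs (s, f(g(s)))."""
    f_map = dict(f)
    return {(s, f_map[t]) for (s, t) in g}


h = {("x", 2), ("y", 1), ("z", 2), ("w", 3)}   # h : V -> S
g = {(1, "beta"), (2, "alpha"), (3, "alpha")}  # g : S -> T
f = {("alpha", "c"), ("beta", "a")}            # f : T -> W

left = compose(compose(f, g), h)   # (f o g) o h
right = compose(f, compose(g, h))  # f o (g o h)

print(left == right)  # True
print(sorted(left))   # [('w', 'c'), ('x', 'c'), ('y', 'a'), ('z', 'c')]
```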


Let us stop for a moment and see where we are. If $S$ is a set, we will let $F(S) = \{f : S \to S\}$.
Thus, $F(S)$ is the set of functions from $S$ to $S$. In light of Definition 2.10, whenever $f, g \in F(S)$
the composition $f \circ g \in F(S)$. Therefore

(i) $F(S)$ is a set and

(ii) given any two elements $f, g \in F(S)$, there is a way of combining them, denoted as $\circ$, to
obtain a third element $f \circ g$ of $F(S)$.


(Note: When we talk about two elements of F(S) or a third element of F(S), we are not 
making any assumptions about whether they are all different or whether some of them might 
be the same.) 


Therefore, we are in a situation where we can always combine two elements of a set to obtain 
a third. As we will see in the following examples, this is a very common situation. 


Examples

1. Let $\mathbb{R}$ be the set of real numbers and let $\circ$ denote ordinary multiplication. Then,
whenever $a, b \in \mathbb{R}$, we have $a \circ b = ab \in \mathbb{R}$.

2. Let $\mathbb{R}^+$ be the set of positive real numbers and let $\circ$ denote ordinary addition. Since
the sum of two positive real numbers is a positive real number, we see that whenever
$a, b \in \mathbb{R}^+$, we have $a \circ b = a + b \in \mathbb{R}^+$.

3. Let $\mathbb{Q}$ be the set of rational numbers and let $\circ$ be defined as $a \circ b = a + b + ab$, for all
$a, b \in \mathbb{Q}$. It is easy to see that if $a, b \in \mathbb{Q}$, then $a \circ b \in \mathbb{Q}$.

4. This example will differ from the first three. In this case, we again consider the set $\mathbb{R}^+$
of positive real numbers, but this time we let $\circ$ denote ordinary subtraction. In this
situation, there are many positive real numbers $a, b$ such that $a \circ b = a - b \in \mathbb{R}^+$.
However, there also exist some $a, b \in \mathbb{R}^+$ such that $a \circ b \notin \mathbb{R}^+$. For one out of many
possible examples, we have $4, 7 \in \mathbb{R}^+$, but $4 \circ 7 = 4 - 7 = -3 \notin \mathbb{R}^+$.


In our first example, we can think of $\circ$ as a function whose domain is $\mathbb{R} \times \mathbb{R}$ and whose range
is contained in $\mathbb{R}$. Similarly, in the second and third examples, $\circ$ can be thought of as a
function whose domain is of the form $S \times S$ and whose range is contained in $S$. But things
are different in the fourth example. In that example, $\circ$ is a function whose domain is $\mathbb{R}^+ \times \mathbb{R}^+$,
but the range is not contained in $\mathbb{R}^+$. To better describe the difference between the fourth
example and the first three, we have


Definition 2.12. A binary function $\circ$ on a set $S$ is a function whose domain is $S \times S$. If we
are in the case where $\circ : S \times S \to S$, then we say that $S$ is closed under $\circ$ and say that $\circ$ is a
binary operation on $S$.

Applying the language of Definition 2.12 to our previous examples, we have

(i) multiplication is a binary operation on the set $\mathbb{R}$,

(ii) addition is a binary operation on the set $\mathbb{R}^+$,

(iii) $\circ$ defined as $a \circ b = a + b + ab$ is a binary operation on the set $\mathbb{Q}$,

(iv) subtraction is not a binary operation on the set $\mathbb{R}^+$ as $\mathbb{R}^+$ is not closed under subtraction.
Observe that subtraction is a binary function on $\mathbb{R}^+$ but it is not a binary operation as the
range of $\circ$ contains negative numbers.
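For a finite set, closure under a binary function can be tested by trying every ordered pair. The sketch below illustrates this; the helper `is_closed` and the two small finite sets are hypothetical examples chosen here, not taken from the text. An infinite set such as the positive reals cannot be checked exhaustively, but a single counterexample such as 4 - 7 = -3 is enough to show that it is not closed.

```python
from itertools import product


def is_closed(S, op):
    """Return True if op(a, b) lies in S for every ordered pair (a, b) in S x S."""
    return all(op(a, b) in S for a, b in product(S, repeat=2))


# Hypothetical finite examples.
print(is_closed({1, -1}, lambda a, b: a * b))    # True
print(is_closed({1, 2, 3}, lambda a, b: a - b))  # False, since 1 - 1 = 0 is not in the set

# The counterexample from the fourth example: 4 and 7 are positive, but 4 - 7 is not.
print(4 - 7 > 0)  # False
```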


Furthermore, Definition 2.12 indicates that composition of functions is a binary operation on 
the set of functions from a set to itself. Having already defined injective and surjective 
functions, the next definition should come as no surprise. 


Definition 2.13. Let f : S — T be a function. We say that f is bijective if it is both injective 
and surjective. 


We can now examine whether the properties of being injective, surjective, and bijective are 
preserved when we compose functions. Calculus courses have given most students experience, 
at least on an informal level, with injective and surjective functions. Despite this, students 
often have great difficulty proving statements about how injective and surjective functions 
behave under composition. This might be because we are now forced to deal with topics that 
are much more formal and less intuitive than those we have dealt with previously. Thus, 
reading and writing proofs about the composition of injective and surjective functions require 
a thorough understanding of all the relevant definitions. 


Theorem 2.14. Let $f : T \to U$ and $g : S \to T$ be functions.

(a) If $f$ and $g$ are both injections, then the composition $f \circ g$ is also an injection.

(b) If $f$ and $g$ are both surjections, then the composition $f \circ g$ is also a surjection.

(c) If $f$ and $g$ are both bijections, then the composition $f \circ g$ is also a bijection.


Proof: For part (a), we need to show that if $a, b \in S$ such that $(f \circ g)(a) = (f \circ g)(b)$, then
$a = b$. Since $(f \circ g)(a) = f(g(a))$ and $(f \circ g)(b) = f(g(b))$, we see that $f(g(a)) = f(g(b))$.

The function $f$ is injective, and $g(a)$ and $g(b)$ give us the same value when plugged into $f$,
so $g(a) = g(b)$. But $g$ is also injective, and $a$ and $b$ give us the same value when plugged into
$g$, so $a = b$, as desired.

For part (b), we need to show that if $c \in U$, then there exists $a \in S$ such that $(f \circ g)(a) = c$. We
will do this in two steps. First, since $f$ is surjective, we know that there exists $b \in T$ such that
$f(b) = c$. Next, the surjectivity of $g$ tells us that there exists $a \in S$ such that $g(a) = b$. Thus, $g$
sends $a$ to $b$ and then $f$ sends $b$ to $c$. More formally, we can write this as

\[ (f \circ g)(a) = f(g(a)) = f(b) = c, \]

as desired.

For part (c), observe that since $f$ and $g$ are both injective and surjective, parts (a) and (b) tell
us that $f \circ g$ is also injective and surjective. Thus, $f \circ g$ is indeed bijective. $\square$
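Parts (a) and (b) of Theorem 2.14 can also be explored by brute force on small finite sets. The sketch below enumerates every pair of functions between three small sets and looks for a counterexample. It is an illustration only: the helper names and the particular sets are assumptions made here, and a finite search of this kind does not replace the proof.

```python
from itertools import product


def all_functions(domain, codomain):
    """Yield every function from domain to codomain as a dict."""
    for values in product(codomain, repeat=len(domain)):
        yield dict(zip(domain, values))


def is_injective(f):
    return len(set(f.values())) == len(f)


def is_surjective(f, codomain):
    return set(f.values()) == set(codomain)


def compose(f, g):
    """Return f o g as a dict sending s to f(g(s))."""
    return {s: f[g[s]] for s in g}


def check_theorem(S, T, U):
    """Search all f : T -> U and g : S -> T for a counterexample to parts (a) and (b)."""
    for f in all_functions(T, U):
        for g in all_functions(S, T):
            fg = compose(f, g)
            if is_injective(f) and is_injective(g) and not is_injective(fg):
                return False
            if is_surjective(f, U) and is_surjective(g, T) and not is_surjective(fg, U):
                return False
    return True


# Sizes 2, 3, 4 allow injections; sizes 4, 3, 2 allow surjections.
print(check_theorem((0, 1), (0, 1, 2), (0, 1, 2, 3)))  # True
print(check_theorem((0, 1, 2, 3), (0, 1, 2), (0, 1)))  # True
```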


If $S$ is a set, we already saw that composition of functions is a binary operation on the set
of functions from $S$ to $S$. If we let $\mathrm{Inj}(S)$, $\mathrm{Sur}(S)$, and $\mathrm{Bij}(S)$ denote, respectively, the
injections, surjections, and bijections from $S$ to $S$, then using the language of Definition 2.12,
it immediately follows from Theorem 2.14 that

Corollary 2.15. Let $S$ be a set and let $\circ$ denote composition of functions.

(a) $\circ$ is a binary operation on the set $\mathrm{Inj}(S)$.

(b) $\circ$ is a binary operation on the set $\mathrm{Sur}(S)$.

(c) $\circ$ is a binary operation on the set $\mathrm{Bij}(S)$.


Before concluding this chapter, we will make a few more observations about the set $\mathrm{Bij}(S)$.
Suppose we define $e : S \to S$ as $e(s) = s$, for all $s \in S$. We call $e$ the identity map and note that it
can be considered to be the subset of $S \times S$ consisting of all ordered pairs where the first and
second terms are the same. It is easy to see that $e$ is both injective and surjective, so $e \in \mathrm{Bij}(S)$.
Next, if $f : S \to S$ then, since $f(s) \in S$, for all $s \in S$, it follows that $e(f(s)) = f(s)$. Therefore,
if $s \in S$, we have

\[ (f \circ e)(s) = f(e(s)) = f(s), \]

and

\[ (e \circ f)(s) = e(f(s)) = f(s). \]

Thus, $f \circ e$, $e \circ f$, and $f$ are all the same function; hence,

\[ f \circ e = f = e \circ f. \]

Observe that $e$ plays the same role for the binary operation $\circ$ that the number 0 plays for
ordinary addition and the number 1 plays for ordinary multiplication. In particular, no matter
what function $f$ you compose with $e$, you always get $f$ back. For this reason, we consider $e$ to
be the identity element of the sets $F(S)$, $\mathrm{Inj}(S)$, $\mathrm{Sur}(S)$, and $\mathrm{Bij}(S)$ under $\circ$.


Now suppose $f \in \mathrm{Bij}(S)$ and consider the set of ordered pairs $(f(s), s)$, where $s \in S$. First,
since $f$ is surjective, we can see every element of $S$ occurs as the first term in at least one
ordered pair of this type. Next, since $f$ is injective, no element of $S$ occurs as the first term in
two different ordered pairs of this type. Therefore, the ordered pairs we described satisfy the
requirements of being a function from $S$ to $S$. Since the ordered pairs in this new function are
the reverse of the ordered pairs of $f$, we can think of this new function as "undoing" $f$. If we
think of $f$ as pushing points two feet to the right, then this new function would push points
two feet to the left. For another example, if $f$ was a function that doubled all the numbers you
plugged into it, then this new function would take half of each number you plugged into it.

Observe that if you plugged something into $f$ and then plugged the answer into the new
function, you end up back where you started. If we switch the order and first plug something
into this new function and then plug the answer into $f$, we again end up back where we started.
Therefore, regardless of the order, whenever we apply $f$ and this new function, the end result
is the same as applying the identity map. Since this new function essentially "inverts" $f$, we
denote it as $f^{-1}$ and call it the inverse function of $f$. Before formalizing things, let's work
through

Example

Let $S = \{1, 2, 3, 4\}$ and let $f \in \mathrm{Bij}(S)$ be described by the ordered pairs $\{(1, 3), (2, 2),
(3, 4), (4, 1)\}$. Then when we reverse the order of the elements of each ordered pair, we
see that the function $f^{-1}$ is described by the ordered pairs $\{(3, 1), (2, 2), (4, 3), (1, 4)\}$.
Observe that $f^{-1}$ is also a bijection. Furthermore, if you compute the compositions
$f \circ f^{-1}$ and $f^{-1} \circ f$, you will observe that

\[ (f \circ f^{-1})(s) = f(f^{-1}(s)) = s = e(s) \]

and

\[ (f^{-1} \circ f)(s) = f^{-1}(f(s)) = s = e(s), \]

for all $s \in S$. (You should check this by letting $s$ take on all the values 1, 2, 3, 4 and
examining how the two compositions behave.)
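The check suggested in the example can also be carried out by machine. The sketch below reverses each ordered pair of f to build the inverse and then compares both compositions with the identity map; the helper names `inverse` and `compose` are assumptions made here.

```python
def inverse(f):
    """Reverse every ordered pair of a bijection given as a set of pairs."""
    return {(t, s) for (s, t) in f}


def compose(f, g):
    """Return f o g as the set of pairs (s, f(g(s)))."""
    f_map = dict(f)
    return {(s, f_map[t]) for (s, t) in g}


S = {1, 2, 3, 4}
f = {(1, 3), (2, 2), (3, 4), (4, 1)}
f_inv = inverse(f)
e = {(s, s) for s in S}  # the identity map on S

print(sorted(f_inv))           # [(1, 4), (2, 2), (3, 1), (4, 3)]
print(compose(f, f_inv) == e)  # True
print(compose(f_inv, f) == e)  # True
```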


Recall that under ordinary addition, the additive inverse of an element is the element that after
adding gets you back to 0. Similarly, under ordinary multiplication, the multiplicative inverse
of an element is the element that after multiplying gets you back to 1. Now, $f^{-1}$ plays that
same role for $f$ under composition of functions, as it gets us back to the identity map $e$. We
now collect and formalize many of our observations.


Theorem 2.16. If $S$ is a set, let $\mathrm{Bij}(S)$ denote the bijections from $S$ to $S$, and let $\circ$ denote
composition of functions. Then $\circ$ is a binary operation on $\mathrm{Bij}(S)$ satisfying

(a) Associative Law: $(f \circ g) \circ h = f \circ (g \circ h)$, for all $f, g, h \in \mathrm{Bij}(S)$;

(b) Identity Element: There is an element $e \in \mathrm{Bij}(S)$ such that $f \circ e = f = e \circ f$, for all
$f \in \mathrm{Bij}(S)$;

(c) Inverses: For every $f \in \mathrm{Bij}(S)$, there is an element in $\mathrm{Bij}(S)$, which we denote as $f^{-1}$,
such that $f \circ f^{-1} = e = f^{-1} \circ f$.


Proof: In Theorem 2.14(c), we showed that $\circ$ is a binary operation on $\mathrm{Bij}(S)$, and in
Theorem 2.11, we showed that the composition of functions is associative. Furthermore,
in the discussion leading up to this theorem, we showed that $\mathrm{Bij}(S)$ has an identity element
under $\circ$. Therefore, it only remains to prove part (c).

Let $g : T \to U$ be a function, and let us analyze, in terms of the ordered pairs corresponding to
$g$ inside of $T \times U$, exactly what it means for $g$ to be bijective. Note that $g$ being injective is
identical to every element of $U$ appearing as the second term in at most one ordered pair of $g$.
In addition, $g$ being surjective is the same as every element of $U$ occurring as the second term
in at least one ordered pair of $g$. As a result, $g$ being bijective is equivalent to every element of
$U$ occurring as the second term in exactly one ordered pair of $g$.

Now suppose $f \in \mathrm{Bij}(S)$; we now know that not only does every element of $S$ occur exactly
once as the first term of an ordered pair of $f$, but every element of $S$ also occurs exactly once
as the second term of an ordered pair of $f$. Therefore, when we let $f^{-1}$ be the function
consisting of all ordered pairs of the form $(f(s), s)$, where $s \in S$, we can see that $f^{-1}$ satisfies
the criterion in the previous paragraph for a function to be a bijection. Thus, $f^{-1} \in \mathrm{Bij}(S)$.

Since we defined $f^{-1}$ to consist of the "reverse" of all the ordered pairs in $f$, it follows that the
ordered pair $(a, b) \in S \times S$ belongs to $f$ if and only if the ordered pair $(b, a)$ belongs to $f^{-1}$.
Now suppose $a \in S$; then the ordered pair $(a, f(a))$ belongs to $f$ and the ordered pair $(f(a), a)$
belongs to $f^{-1}$. This means that $f$ sends $a$ to $f(a)$ and $f^{-1}$ sends $f(a)$ back to $a$. Therefore,

\[ (f^{-1} \circ f)(a) = f^{-1}(f(a)) = a = e(a), \]

where $e$ is the identity map.


Similarly, if $b \in S$, then when we let $c = f^{-1}(b)$, we know that the ordered pair $(b, c)$ belongs
to $f^{-1}$. But this means that the ordered pair $(c, b)$ belongs to $f$, which is another way of
saying that $f(c) = b$. As a result, $f^{-1}$ sends $b$ to $c$ and $f$ then sends $c$ back to $b$. More formally,
we have

\[ (f \circ f^{-1})(b) = f(f^{-1}(b)) = f(c) = b = e(b). \]

Thus, $f \circ f^{-1} = e = f^{-1} \circ f$, and every element of $\mathrm{Bij}(S)$ does indeed have an inverse in
$\mathrm{Bij}(S)$ under $\circ$.
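For a small set, all of the conditions in Theorem 2.16 can be verified exhaustively. The sketch below does this for the three-element set {1, 2, 3}, whose bijections are generated with `itertools.permutations`. The choice of set and the variable names are assumptions made for illustration, and the check covers only this one set.

```python
from itertools import permutations, product

S = (1, 2, 3)

# Each bijection S -> S is stored as a dict sending s to its image.
bij = [dict(zip(S, p)) for p in permutations(S)]


def compose(f, g):
    """Return f o g as a dict sending s to f(g(s))."""
    return {s: f[g[s]] for s in S}


e = {s: s for s in S}  # the identity map

closed = all(compose(f, g) in bij for f, g in product(bij, repeat=2))
associative = all(
    compose(compose(f, g), h) == compose(f, compose(g, h))
    for f, g, h in product(bij, repeat=3)
)
has_identity = all(compose(f, e) == f == compose(e, f) for f in bij)
has_inverses = all(
    any(compose(f, g) == e == compose(g, f) for g in bij) for f in bij
)

print(len(bij), closed, associative, has_identity, has_inverses)
# 6 True True True True
```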


As we will see throughout this book, groups are among the most important objects in abstract
algebra and are an essential part of Galois' proof of the insolvability of the quintic. Later, in
much greater detail, we will discuss the fact that groups are sets with a binary operation
satisfying the three conditions in Theorem 2.16. Thus, $\mathrm{Bij}(S)$ with $\circ$, for various sets $S$,
provides us with some of the many, many examples of groups we will come across in this
course. In fact, groups of the form $\mathrm{Bij}(S)$, where $S$ is the set of roots of a polynomial, are
essential to Galois' work.


Exercises for Section 2.4 


In exercises 1-3, let $S = \{1, 2, 3, 4, 5\}$ and $T = \{a, b, c\}$ and consider the following six subsets
of $S \times T$:


A={(1,5), 2,5), G,a), G,0), 4,0), 6,0}, 
B={(1,b), 2,6), G,c), 4,0), 6, )}, 
C={(,c), 2,5), 3,5), 4,0), 6, a}, 
D={(1,c), @,¢), G,a), 6, by}, 

E={(1,c), 2,4), G,a), (4,¢), 6, a}, 
F={(,b), (2,c), 3,a), (4,0), (5, D)}. 


1. Which of the six subsets represents a function from S to T? 


2. Using your answer from exercise 1, which of these subsets represents an injective 
function? 


3. Using your answer from exercise 1, which of these subsets represents a surjective 
function? 




In exercises 4-6, let $U = \{\alpha, \beta, \gamma\}$ and $V = \{x, y, z, w\}$ and consider the following six subsets
of $U \times V$:


={(a, y), (By), (% w)}, 

= {(a, w), (% x)}, 

= {(a, w), (By), (% XDI 

= {(@, y), (8.x), (%2)}, 
={(@, x), (B.y¥),%2),(% w)}, 
= {(a, 2), (By), (% 2). 


4. Which of the six subsets represents a function from $U$ to $V$?

5. Using your answer from exercise 4, which of these subsets represents an injective
function?

6. Using your answer from exercise 4, which of these subsets represents a surjective
function?


In exercises 7-24, let $S = \{1, 2, 3\}$ and let $f, g, h \in \mathrm{Bij}(S)$ be as follows:

$f = \{(1, 3), (2, 2), (3, 1)\}$, $g = \{(1, 3), (2, 1), (3, 2)\}$, $h = \{(1, 1), (2, 3), (3, 2)\}$.

In exercises 7-24, you will be doing computations in $\mathrm{Bij}(S)$. In your answers, express
elements of $\mathrm{Bij}(S)$ as subsets of $S \times S$.

7. Find $f \circ g$.

8. Find $g \circ f$.

9. Find $f^{-1}$.

10. Find $g^{-1}$.

11. Find $f^{-1} \circ g^{-1}$.

12. Find $g^{-1} \circ f^{-1}$.

13. Find $(f \circ g)^{-1}$.

14. Find $(g \circ f)^{-1}$.

15. Compare your answers from exercises 13 and 14 to your answers from exercises 11
and 12 and briefly discuss what you noticed.

16. Find $g \circ h$.

17. Using your answer from exercise 16, find $f \circ (g \circ h)$.

18. Using your answer to exercise 7, find $(f \circ g) \circ h$. In light of your answer to exercise 17, is
this a surprise?

19. Find $g \circ g$. How do $g \circ g$ and $g^{-1}$ compare?

20. Based on your answer to the second part of exercise 19, what do you think $(g \circ g) \circ g$ is
equal to? Check your answer by computing $(g \circ g) \circ g$.

21. Find $h^{-1}$.

22. Find $g^{-1} \circ h^{-1}$.

23. Find $h^{-1} \circ g^{-1}$.

24. In light of the previous exercises, do you expect $(g \circ h)^{-1}$ to equal $g^{-1} \circ h^{-1}$ or
$h^{-1} \circ g^{-1}$? Check your answer by computing $(g \circ h)^{-1}$.


In exercises 25-32, f : S → T is defined using the formula f(x) = x², for all x ∈ S. S and T
will be various subsets of the real numbers R.


25. If S = [0, 5] and T = R, is f injective and is f surjective?

26. If S = [0, 5] and T = [0, 25], is f injective and is f surjective?

27. If S = [−5, 5] and T = [−25, 25], is f injective and is f surjective?

28. If S = [−5, 5] and T = [0, 25], is f injective and is f surjective?

29. If S = [1, 2] and T = [0, 4], is f injective and is f surjective?

30. If S = [−1, 0] ∪ [1, 2] and T = [0, 4], is f injective and is f surjective?

31. If S = (−1, 0] ∪ [1, 2] and T = [0, 4], is f injective and is f surjective?

32. If S = [−1, 2] and T = [−4, 4], is f injective and is f surjective?

33. Give an example of a function f : R → R that is injective but not surjective.

34. Give an example of a function g : R → R that is surjective but not injective.

35. Show that if S is a finite set, then every injective function h : S → S is also surjective.

36. Show that if S is a finite set, then every surjective function j : S → S is also injective.

37. If T is a set and f₁, f₂, ..., fₙ are injective functions from T to T, show that the
composition f₁ ∘ f₂ ∘ ··· ∘ fₙ is also injective. (In light of the associative law, we do not
need to use any parentheses when composing these n functions.)

38. If V is a set and g₁, g₂, ..., gₙ are surjective functions from V to V, show that the
composition g₁ ∘ g₂ ∘ ··· ∘ gₙ is also surjective. (In light of the associative law, we do not
need to use any parentheses when composing these n functions.)


In exercises 39-53, you are given a set S with binary function ∘. Determine if S is closed
under ∘, thereby making ∘ a binary operation. If S is not closed under ∘, give an example of
a, b ∈ S such that a ∘ b ∉ S.


39. Let S = {a + bx | a, b ∈ R} and let ∘ be the ordinary addition of polynomials.

40. Let S = {a + bx | a, b ∈ R} and let ∘ be the ordinary multiplication of polynomials.

41. Let S be the set of polynomials with rational coefficients and let ∘ be the ordinary
multiplication of polynomials.

42. Let S be the set of polynomials with rational coefficients of odd degree and let ∘ be the
ordinary multiplication of polynomials.

43. Let S be the set of polynomials with rational coefficients of even degree and let ∘ be the
ordinary multiplication of polynomials.

44. Let S be the set of odd integers and let ∘ be ordinary addition.

45. Let S be the set of odd integers and let ∘ be ordinary multiplication.

46. Let S be the set of nonzero real numbers and let ∘ be ordinary multiplication.

47. Let S be the set of positive real numbers and let ∘ be ordinary multiplication.

48. Let S be the set of negative real numbers and let ∘ be ordinary addition.

49. Let S be the set of negative real numbers and let ∘ be ordinary multiplication.

50. Let S be the set of polynomials with integer coefficients and let ∘ be ordinary
multiplication.

51. Let S be the set of real numbers that are roots of the polynomial x* − 1 and let ∘ be
ordinary multiplication.

52. Let S be the set of real numbers that are roots of the polynomial x” − 1 and let ∘ be
ordinary addition.

53. Let S be the set of polynomials with rational coefficients of degree less than 50 and
let ∘ be ordinary multiplication.


In exercises 54-59, let S, T, V be sets and let f : T → V and g : S → T be functions.


54. Show that if f ∘ g is injective, then g must be injective.

55. Show that if f ∘ g is injective and g is surjective, then f must be injective.

56. Give an example of S, T, V, f, and g such that f ∘ g is injective and f is not injective.

57. Show that if f ∘ g is surjective, then f must be surjective.

58. Show that if f ∘ g is surjective and f is injective, then g must be surjective.

59. Give an example of S, T, V, f, and g such that f ∘ g is surjective and g is not surjective.

In exercises 60-71, let S be a set with 3 elements, T a set with 4 elements, and V a set with
5 elements.


60. How many functions f : S → T are there?

61. How many functions f : T → S are there?

62. How many functions f : S → T are injective?

63. How many functions f : T → S are injective?

64. How many functions f : S → T are surjective?

65. How many functions f : T → S are surjective?

66. How many functions f : S → V are there?

67. How many functions f : V → S are there?

68. How many functions f : S → V are injective?

69. How many functions f : V → S are injective?

70. How many functions f : S → V are surjective?

71. How many functions f : V → T are surjective?


In exercises 72-74, S is a set with n > 1 elements and T is a set with m > 1 elements.


72. How many functions f : S → T are there?

73. If n < m, how many functions f : S → T are injective?

74. If m = n + 1, how many functions f : T → S are surjective?

75. If S is a finite set and f ∈ Bij(S), show that there is a positive integer t such that the
composition of t copies of f, f ∘ f ∘ ··· ∘ f, is equal to the identity map in Bij(S).


The Integers 


Throughout this book, we will work with many number systems that you have probably seen 
before, such as the integers Z, the rational numbers Q, the real numbers R, and the complex 
numbers C. In addition, we will look at other types of number systems that may be new to you, 
such as the integers modulo a prime, Zₚ. The first number system we will examine is the
integers, which is the infinite set


Z = {..., −5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5, ...}.


We begin with the integers for three reasons: 


(a) Many other number systems are built up from the integers. 


(b) Other algebraic objects, such as polynomials, have important properties in common with 
the integers. 


(c) Proofs of facts about other algebraic objects will often reduce down to questions about 
the integers. 


In previous courses, you have not only added and multiplied integers but you have probably 
been introduced to concepts such as divisibility, prime numbers, greatest common divisors, 
and factorization into primes. Not only will similar concepts occur when studying sets of 
polynomials such as Q[x], R[x], C[x] but they also arise in even more abstract settings. 

By first developing a deep understanding of these concepts, as they relate to Z, we will be well 
prepared to apply them to polynomials in Chapters 9, 12, and 17. In fact, a recurring theme of 
this book is the important similarity between the algebraic structure of the integers and various 
sets of polynomials. 


3.1 Prime Numbers 


In our study of the integers, the first concept we will be introduced to is divisibility. 




Definition 3.1. Given integers a and b (with a ≠ 0), we say that a divides b, written a | b, if
there exists an integer m such that b = a · m. In this case, we also say that a is a divisor of b or
that b is a multiple of a.


If a is not a divisor of b, we write a ∤ b.


Examples 


2 | 4,  4 ∤ 2,  4 | 0,  −7 | 63,  5 | 25,  −10 | −10,  13 | 39,  13 ∤ 93.


Observe that 1 and —1 are divisors of all integers, and every nonzero integer is a divisor of 0. 
A useful fact that we'll use throughout this chapter is that if a and b are integers such that a | b
and b is positive, then b ≥ a.
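
Definition 3.1 can be checked mechanically with a remainder test. The following Python sketch is only an illustration; the function name `divides` is our own choice and is not notation used elsewhere in the text.

```python
def divides(a: int, b: int) -> bool:
    """Return True when a | b, that is, when b = a * m for some integer m."""
    if a == 0:
        # Definition 3.1 only speaks about nonzero a.
        raise ValueError("a must be nonzero")
    return b % a == 0


print(divides(2, 4))     # True, since 4 = 2 * 2
print(divides(4, 2))     # False
print(divides(-7, 63))   # True, since 63 = (-7) * (-9)
print(divides(5, 0))     # True: every nonzero integer divides 0
```

The test `b % a == 0` simply asks whether the remainder of b on division by a is zero, which is the same as asking for an integer m with b = a · m.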


The basic building blocks of the integers are the prime numbers. A thorough understanding of 
the properties of prime numbers will be very useful in proving facts about the integers and 
other algebraic objects. 


Definition 3.2. An integer n is called prime if n > 1 and the only positive divisors of n are 1
and n.


As a result, the list of prime numbers begins
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, ...
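
The list above can be reproduced by applying Definition 3.2 literally. The sketch below is deliberately naive (it lists every positive divisor), and the names `is_prime` and `primes_below_44` are our own.

```python
def is_prime(n: int) -> bool:
    """Definition 3.2: n > 1 and the only positive divisors of n are 1 and n."""
    if n <= 1:
        return False
    positive_divisors = [d for d in range(1, n + 1) if n % d == 0]
    return positive_divisors == [1, n]


primes_below_44 = [n for n in range(1, 44) if is_prime(n)]
print(primes_below_44)
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43]
```

Note that `is_prime(1)` is False: the requirement n > 1 is built into the definition.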
At this point, two questions you may have about prime numbers are: 


(a) The definition of prime numbers says that all primes must be greater than 1. This seems 
somewhat arbitrary. Why is 1 not considered to be a prime number? After all, its only 
positive divisor is 1. 


(b) How many prime numbers are there? In particular, is the number of primes finite or 
infinite? 


We will answer both of these questions shortly. Despite the fact that prime numbers are very 
concrete and have been studied for centuries, there are still some very basic and easily stated 
questions about them that remain unanswered after all these years. The following are two of 
these questions. 


Goldbach’s Conjecture 
Can every even integer greater than 2 be written as a sum of two prime numbers? 


For example, we have 4 = 2 + 2, 6 = 3 + 3, 8 = 3 + 5, 10 = 3 + 7, and 10 = 5 + 5.
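
For small even numbers, such decompositions can be found by a direct search. The following Python sketch is illustrative only; the names `is_prime` and `goldbach_pairs` are ours, and checking finitely many cases says nothing about the conjecture in general.

```python
def is_prime(n: int) -> bool:
    """Trial division: n is prime if n >= 2 and no d with d*d <= n divides it."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def goldbach_pairs(n: int) -> list:
    """All pairs (p, q) of primes with p <= q and p + q = n."""
    return [(p, n - p) for p in range(2, n // 2 + 1)
            if is_prime(p) and is_prime(n - p)]


for n in (4, 6, 8, 10):
    print(n, goldbach_pairs(n))
# 4 [(2, 2)]
# 6 [(3, 3)]
# 8 [(3, 5)]
# 10 [(3, 7), (5, 5)]
```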




Twin Primes Conjecture 

We say that a pair of prime numbers are twin primes if they differ by 2. For example, 3 and 5 
are twin primes, as well as 17 and 19. The twin primes conjecture asks if there are an infinite 
number of pairs of twin primes. 
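
Pairs of twin primes below a given bound are easy to list by machine. This is a minimal sketch under our own naming (`is_prime`, `twin_prime_pairs`); the bound 50 is an arbitrary choice for illustration.

```python
def is_prime(n: int) -> bool:
    """Trial division: n is prime if n >= 2 and no d with d*d <= n divides it."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def twin_prime_pairs(bound: int) -> list:
    """All pairs (p, p + 2) of primes with both members less than bound."""
    return [(p, p + 2) for p in range(2, bound - 2)
            if is_prime(p) and is_prime(p + 2)]


print(twin_prime_pairs(50))
# [(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43)]
```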


In some of your previous mathematics courses, you may have gotten the impression that 
everything that can be known in mathematics is known and has been known for hundreds of 
years. However, the preceding two questions are just two of the many questions being studied 
today by researchers in mathematics. At present, there are thousands of people around the 
world actively engaged in mathematical research. In fact, the last 50 years have witnessed more
research in mathematics than any other period in our history. 


There are several interesting comments we can make about primes and twin primes that are
related to concepts you have probably seen in a calculus course. Recall that when you studied
infinite series, you saw that the sum ∑_{n=1}^{∞} 1/n was infinite. Suppose instead of letting n range
over the entire set of positive integers, we only let n range through the set of prime numbers.
It has also been shown that the sum


∑_{n is prime} 1/n = 1/2 + 1/3 + 1/5 + 1/7 + 1/11 + 1/13 + 1/17 + 1/19 + 1/23 + 1/29 + ···


is infinite. On the other hand, we can look at the sum where we only let n range through the set
of primes that are part of a pair of twin primes to obtain the new sum


∑_{n is a twin prime} 1/n = 1/3 + 1/5 + 1/7 + 1/11 + 1/13 + 1/17 + 1/19 + 1/29 + ···

Even though this sum looks very similar to the previous sum, it turns out that this sum is finite.
What does all this tell us? First, it says that the set of prime numbers is certainly infinite. This 
is a fact we will prove shortly using different ideas and techniques. Second, although it does 
not tell us whether the set of pairs of twin primes is infinite, it does say that in some sense the 
number of primes that are part of a pair of twin primes is much smaller than the number of 
primes. Admittedly, at this point, we are being very informal and imprecise in discussing the 
relative sizes of infinite sets. However, it is worthwhile to get a taste of some of the things that 
one can prove about prime numbers in a course in number theory. 
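
To get a feel for the two sums, we can compute partial sums numerically. The Python sketch below is an informal experiment, not a proof: a finite computation cannot show that one sum is infinite and the other finite. The function names and the cutoffs are our own choices.

```python
import math


def primes_up_to(limit: int) -> list:
    """Sieve of Eratosthenes: all primes p with p <= limit."""
    if limit < 2:
        return []
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for d in range(2, math.isqrt(limit) + 1):
        if flags[d]:
            for multiple in range(d * d, limit + 1, d):
                flags[multiple] = False
    return [n for n in range(limit + 1) if flags[n]]


def reciprocal_sums(limit: int) -> tuple:
    """Partial sums of 1/p over primes p <= limit and over twin primes p <= limit."""
    primes = primes_up_to(limit + 2)   # look 2 further to detect a twin partner
    prime_set = set(primes)
    prime_sum = sum(1 / p for p in primes if p <= limit)
    twin_sum = sum(1 / p for p in primes
                   if p <= limit and (p - 2 in prime_set or p + 2 in prime_set))
    return prime_sum, twin_sum


for limit in (100, 10_000, 100_000):
    prime_sum, twin_sum = reciprocal_sums(limit)
    print(limit, round(prime_sum, 4), round(twin_sum, 4))
```

In such experiments both partial sums grow very slowly, so the output should be read only as an illustration of how much sparser the twin primes are.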


We can now begin our examination of prime factorization. Let us consider the following 
examples: 


6 = 2¹·3¹,  48 = 2⁴·3¹,  125 = 5³,  363 = 3¹·11²,  360 = 2³·3²·5¹.


Each of the numbers 6, 48, 125, 363, and 360 has been written as a product of prime numbers.
Observe that the only other way we could write these numbers as a product of primes is to
juggle the order of the primes. For example,


363 = 3¹·11² = 11¹·3¹·11¹ = 11²·3¹


are the only ways to write 363 as a product of primes. In light of this, we say that the
factorization of 363 as 3¹·11² is unique up to order. The main result of this chapter will be
that every integer greater than 1 can be written uniquely (up to order) as a product of primes. 
In order to prove this fact, we will need to develop two important pieces of mathematical 
machinery: the Division Algorithm and the Euclidean Algorithm. 


Before stating our main result on prime factorization, two comments should be made. First, 
when we use the expression “product of primes,” we allow products of length 1. By this we 
mean that the factorization of the number 5 into primes is simply the single prime 5¹.
In other words, whereas the factorization of 15 into primes is 15 = 3¹·5¹, the way we factor 5
into primes is 5 = 5¹. Normally, we think of a product as requiring at least two numbers to be
multiplied, but we will now consider a single number as a product of length 1. 


Second, we can write the number 6 as a product in the following different ways: 
6 = 2·3 = 1·2·3 = 1²·2·3 = 1³·2·3 = 1⁴·2·3.


Note that this says if we consider 1 to be a prime number, then in the preceding equation, we
have provided five different ways to express 6 as a product of primes. Therefore, in order to 
have a reasonable concept of unique factorization, it is important that we do not consider 1 to 
be a prime number. 


3.2 Unique Factorization 


We can now state the main result of this chapter. 


Theorem 3.3—Unique Factorization Theorem. Every integer n > 1 can be written 
uniquely (up to order) as a product of primes. 


There are actually two parts to this theorem. First, we will need to show that every integer 
n > 1 can be written as a product of primes. That will not be a difficult task. The harder job 
will be to show that this factorization is unique (up to order). 


Intuition. In practice, how do we factor numbers into primes? The basic idea is that if a 
number is prime, then it is already factored into primes as a product of length 1, but if it is not 
prime, then it can be written as a product of two smaller numbers. We then apply this same 
procedure to these two smaller numbers and continue repeating this procedure until only prime 
numbers remain. 




Earlier we observed that 360 = 2³·3²·5¹. We can now apply the preceding procedure to the
number 360 to see how this factorization was obtained. At various points in this procedure, we
will be expressing a number as a product of two smaller numbers. The way that you choose to
do this is entirely up to you. For example, we could begin by saying 360 = 18·20 or
360 = 8·45 or 360 = 12·30. An important aspect of the uniqueness of prime factorization is
that regardless of the choices you make along the way, you will always end up with the same
factorization.


We begin by observing that 360 is not prime, since 10 is a divisor. Therefore
360 = 10·36.
However, 10 is not prime, since 5 is a divisor. Therefore,
10 = 2·5 and now 360 = 2·5·36.
But 36 is not prime, since 2 is a divisor. Therefore,
36 = 2·18 and now 360 = 2·5·2·18.
But 18 is not prime, since 6 is a divisor. Therefore,
18 = 6·3 and now 360 = 2·5·2·6·3.
But 6 is not prime, since 3 is a divisor. Therefore,
6 = 3·2 and now 360 = 2·5·2·3·2·3.


Since 2, 3, and 5 are all primes, the factorization of the number 360 into primes is complete
and, by juggling the order of the primes, we obtain


360 = 2³·3²·5¹.


You should try factoring 360 again, but this time, make different choices than the preceding
ones. Then, at the end, juggle the order of your primes until you once again obtain the
factorization 360 = 2³·3²·5¹. As we reflect on this procedure, we can ask ourselves, how do
we know when we are “done”? The answer is, when the only factors remaining are prime 
numbers. But how do we know that we ever reach a point where only prime numbers remain? 
This may sound very similar to the question we posed after Proposition 2.3, when we 
wondered if the procedure for reducing fractions ever came to an end. In light of this, it may 
come as no surprise to you that we will need to use the Well Ordering Principle in a proof by 
contradiction to rigorously show that this procedure does indeed always reach a point where 
only prime factors remain. 
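
The splitting procedure described above can be sketched in Python. This is one possible rendering, not a prescribed method: the helper `split` always peels off the smallest divisor, whereas in the discussion above the choice of factors is left entirely to you. The names `split` and `prime_factors` are our own.

```python
def split(n: int):
    """Return (a, b) with n = a * b and 1 < a <= b < n, or None when n is prime."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return None


def prime_factors(n: int) -> list:
    """Keep splitting non-prime factors until only primes remain.

    Returns the primes in increasing order, with repetition.
    """
    if n <= 1:
        raise ValueError("n must be greater than 1")
    pending = [n]        # factors that still have to be examined
    primes = []          # factors known to be prime
    while pending:
        m = pending.pop()
        parts = split(m)
        if parts is None:
            primes.append(m)       # m is prime: a product of length 1
        else:
            pending.extend(parts)  # m = a * b with both factors smaller than m
    return sorted(primes)


print(prime_factors(360))   # [2, 2, 2, 3, 3, 5]
# Starting from 360 = 10 * 36 or from 360 = 18 * 20 gives the same primes:
print(sorted(prime_factors(10) + prime_factors(36)))   # [2, 2, 2, 3, 3, 5]
print(sorted(prime_factors(18) + prime_factors(20)))   # [2, 2, 2, 3, 3, 5]
```

The loop stops because every split replaces a number by two strictly smaller numbers greater than 1, which is the informal version of the Well Ordering argument mentioned above.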




Proof of the first part of Theorem 3.3 (the existence of prime factorization). We proceed with
a proof by contradiction, so we will suppose that there exists an integer n > 1 that cannot be
written as a product of primes. The Well Ordering Principle now guarantees that there is a
smallest positive integer m such that m > 1 and m cannot be written as a product of primes.
Let us now examine the nature of m. One possibility is that m is prime, but in this case we can
factor m into primes, as m = m¹. In this case, we see that, simultaneously, m can and cannot be
factored into primes. This is certainly a contradiction, so the case of m being a prime cannot
occur.


The only remaining possibility is that m is not prime, so we can write m as a product of two
smaller positive integers. As a result, there exist integers a, b such that


m = a·b, where 1 < a < m and 1 < b < m.


Since m is the smallest integer that is bigger than 1 that cannot be written as a product of
primes, we see that a and b can be written as a product of primes. Therefore, there exist prime
numbers p₁, p₂, ..., pₖ, q₁, q₂, ..., qₗ such that


a = p₁·p₂···pₖ and b = q₁·q₂···qₗ.


Note that the list of primes p₁, p₂, ..., pₖ, q₁, q₂, ..., qₗ is allowed to have the same prime
number occurring more than once.


Since m = a·b, we now have
m = a·b = p₁·p₂···pₖ·q₁·q₂···qₗ.


However, the preceding equation illustrates that m can be written as a product of primes. This 
is a contradiction, since m, simultaneously, can and cannot be written as a product of primes. 
We have now shown that both of our cases, m being prime and m not being prime, lead to a 
contradiction. Therefore, there does not exist an integer n > 1 that cannot be written as a 
product of primes. We can now conclude that every integer n > 1 can indeed be written as a 
product of primes. 


Earlier we asked if there is an infinite number of primes. We will now see that the answer is 
yes and that it follows from the existence of prime factorization. 


Theorem 3.4. There is an infinite number of prime numbers. 


Intuition. Suppose you are given a list of prime numbers like 2, 5, 13, 29. What happens if
we multiply all of these numbers and then add 1? In this case, we obtain


3770 = 2·5·13·29 and 3771 = 3770 + 1 = 2·5·13·29 + 1.


Observe that any number that divides 3771 and 3770 must divide 3771 − 3770 = 1. As a
result, the only positive integer that divides both 3771 and 3770 is 1. In particular, there is no
prime number that divides both 3771 and 3770. In light of this, none of the primes 2, 5, 13, 29
can divide 3771.


The existence of prime factorization tells us that there are prime numbers that divide 3771 and
our previous argument tells us that none of those primes can be equal to 2, 5, 13, or 29.
Therefore, any prime that divides 3771 is a prime that is not on our original list. What this
argument really tells us is that given any finite list of prime numbers, there exists a prime
number that is not on the list. It will not be difficult to formalize this argument to produce a
proof of Theorem 3.4. Returning to our example,


3771 = 3²·419


and 3 and 419 are primes not on our original list 2, 5, 13, 29.
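
The construction in this example is easy to carry out by machine. The following sketch multiplies the primes on a list, adds 1, and returns the smallest prime dividing the result; the names `smallest_prime_factor` and `prime_not_on_list` are our own, and the input is assumed to be a nonempty list of primes.

```python
from math import prod


def smallest_prime_factor(n: int) -> int:
    """Smallest prime dividing n, for n >= 2 (returns n itself when n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def prime_not_on_list(primes: list) -> int:
    """Return a prime that divides (product of the list) + 1."""
    n = prod(primes) + 1
    return smallest_prime_factor(n)


print(prod([2, 5, 13, 29]) + 1)            # 3771
print(prime_not_on_list([2, 5, 13, 29]))   # 3
```

Any prime returned this way leaves remainder 1, not 0, when the product of the list is divided by it, so it cannot be one of the listed primes.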


Proof: We will proceed with a proof by contradiction, so suppose there is only a finite number
of prime numbers. Then there is a finite list p₁, p₂, ..., pₘ consisting of all the prime
numbers. Next, let


n = p₁·p₂···pₘ + 1.


Thus, n is 1 more than the product of all the prime numbers.


Any integer that divides both n and p₁·p₂···pₘ must also divide
n − p₁·p₂···pₘ = 1.


As a result, the only positive integer that divides both n and p₁·p₂···pₘ is 1. In particular,
there is no prime number that divides both n and p₁·p₂···pₘ. However, the existence of
prime factorization tells us that there is some prime q that divides n. Since p₁, p₂, ..., pₘ is a
complete list of prime numbers, q appears somewhere on the list, and that tells us that q is also
a factor of p₁·p₂···pₘ. This is a contradiction, since we already know that no prime can
divide both n and p₁·p₂···pₘ.


3.3 Division Algorithm 


In order to be able to prove the uniqueness of prime factorization, we need to first turn our 
attention to division. For example, if we divide 64 by 7, we obtain 


64 = 9·7 + 1.


Recall that we call 9 the quotient and 1 the remainder. Similarly, if we divide 101 by 8, we
obtain


101 = 12·8 + 5,


so 12 is the quotient and 5 is the remainder.


We need to be a little more careful when dividing negative integers. Suppose we divide −64
by 7. We obtain


−64 = (−9)·7 − 1 and −64 = (−10)·7 + 6.


Looking at the preceding equations, your first instinct might be to say that −9 is the quotient
and −1 is the remainder. On the other hand, perhaps you feel that −10 is the quotient and 6 is
the remainder. Which of these answers do we consider to be the “correct” answer?


Recall that when we divided 64 by 7, we said that 9 was the quotient and 1 was the remainder.
Suppose someone else said that since 64 = 6·7 + 22, we should then consider 6 to be the
quotient and 22 to be the remainder. We would probably say that we do not consider this to be
the correct answer, as we demand that the remainder when dividing by 7 must be less than 7
and also not be negative. For this reason, if someone else pointed out that


64 = 10·7 − 6,


we would also not consider 10 and −6 to be the quotient and remainder, as the remainder must
be less than 7 and also not be negative. In light of this, when we reexamine the problem of
dividing −64 by 7, we consider −10 to be the quotient and 6 to be the remainder, since the
remainder must be less than 7 and also not be negative.


In the preceding examples, it is the restrictions we place on the size of the remainder that allow 
us to obtain a unique quotient and remainder when dividing by positive integers. 
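
The candidate quotient and remainder pairs discussed above can be tested against the two requirements directly. In this Python sketch the helper name `is_valid_division` is ours; the built-in `divmod` is Python's own, and for a positive divisor it returns the pair with a nonnegative remainder.

```python
def is_valid_division(n: int, a: int, q: int, r: int) -> bool:
    """Check that n = q * a + r and that the remainder satisfies 0 <= r < a."""
    return n == q * a + r and 0 <= r < a


# (n, q, r) triples taken from the discussion of dividing by 7.
candidates = [(64, 9, 1), (64, 6, 22), (64, 10, -6), (-64, -9, -1), (-64, -10, 6)]
for n, q, r in candidates:
    print(n, q, r, is_valid_division(n, 7, q, r))
# 64 9 1 True
# 64 6 22 False
# 64 10 -6 False
# -64 -9 -1 False
# -64 -10 6 True

print(divmod(64, 7), divmod(-64, 7))   # (9, 1) (-10, 6)
```

Every candidate satisfies the equation n = q · 7 + r; only the size restriction on r singles out one pair for each n.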


Theorem 3.5 (The Division Algorithm). If a is a positive integer and n is any integer, then
there exist unique integers q and r with the properties that


n = q·a + r and 0 ≤ r < a.
We call q the quotient and r the remainder.


Intuition. To see what happens when we perform division, let us examine the special cases
where we divide the numbers 64 and −64 by 7. When we divide 64 by 7, we can begin
subtracting copies of 7 away from 64 to obtain the list


64, 57, 50, 43, 36, 29, 22, 15, 8, 1, −6, −13, −20, ...


The smallest positive number on this list is 1, and it was obtained by subtracting 9 copies of 7
from 64. In other words, this tells us that 64 = 9·7 + 1.


On the other hand, when we divide −64 by 7, we can begin adding copies of 7 to −64 to
obtain the list


−64, −57, −50, −43, −36, −29, −22, −15, −8, −1, 6, 13, 20, ...


The smallest positive number on this list is 6, and it was obtained by adding 10 copies of 7 to
−64. This tells us that −64 = (−10)·7 + 6.


In both of the preceding examples, we either added or subtracted copies of 7 and found the 
smallest positive integer that occurs in this way. That is how we found the remainder. The 
quotient then refers to how many copies of 7 we needed to add or subtract to reach the 
remainder. At this point, it should come as no surprise that when writing the proof of the 
Division Algorithm, it will be the Well Ordering Principle that will guarantee that there is 
always a smallest positive integer that can be obtained by either adding or subtracting copies 
of the number we are dividing by. 
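
The idea of adding or subtracting copies of the divisor can be written as a short loop. The sketch below is our own illustration (the name `divide` is hypothetical), and it stops at the smallest nonnegative value so that the remainder 0 is also handled when a divides n.

```python
def divide(n: int, a: int) -> tuple:
    """Return (q, r) with n = q * a + r and 0 <= r < a, for a positive integer a."""
    if a <= 0:
        raise ValueError("a must be a positive integer")
    q, r = 0, n
    # Throughout both loops the relation n = q * a + r stays true.
    while r >= a:      # r is too large: subtract a copy of a
        r -= a
        q += 1
    while r < 0:       # r is negative: add a copy of a
        r += a
        q -= 1
    return q, r


print(divide(64, 7))    # (9, 1)
print(divide(-64, 7))   # (-10, 6)
print(divide(63, 7))    # (9, 0)
```

This is far slower than ordinary long division, but it mirrors the lists of numbers written out above.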


Proof. We will first show that given a positive integer a and an integer n, there exist integers q
and r with the properties that


n = q·a + r and 0 ≤ r < a.

We will handle the question of the uniqueness of the quotient and remainder immediately
thereafter.


The first possibility is that a is a divisor of n. Therefore, in this case, there exists an integer
q such that n = q·a. As a result, q is our quotient and 0 is our remainder. Note that this will
be the only case where the remainder can be 0.


We are now in the case where a is not a divisor of n, and we will let
B = {n − t·a | t ∈ Z}.


In other words, B is the collection of all numbers that can be obtained from n by adding and
subtracting copies of a. Observe that regardless of whether n is positive or negative, the set B
contains some positive integers. The Well Ordering Principle guarantees that there exists a
smallest positive integer in B, and we will denote this integer as r. In addition, since r belongs
to the set B, we know that there exists some integer q such that


r = n − q·a.


We claim that 1 ≤ r < a. To see this, observe that
(1) r − a = n − (q + 1)·a.


Note that if r − a = 0, then n = (q + 1)·a, which says that a is a divisor of n. Since we are in
the case where a is not a divisor of n, that would be a contradiction. As a result, r − a ≠ 0,
which says that r ≠ a.


Now suppose that r > a; this says that r − a > 0. But this would say that r − a is a positive
integer that is smaller than r and, using (1), is also in the set B. This is also a contradiction.
Since we have shown that both of the cases r = a and r > a cannot occur, it must be the case
that r < a. However, if we combine this with the fact that we already knew r was positive, we
have indeed shown that 1 ≤ r < a. Therefore, in this case, we see that q and r are integers with
the properties that


n = q·a + r and 1 ≤ r < a.


As a result, we have shown that, in all cases, there exist a quotient g and a remainder r with the 
desired properties. 


To complete the proof, we need to show that the quotient g and remainder r are unique. This 
means that no matter how many times you divide n by a, you will always obtain the same 
quotient and remainder, provided that we insist that the remainder be less than a and not be 
negative. 


Suppose, on two different occasions, we divide n by a and obtain
n = q₁·a + r₁ and n = q₂·a + r₂,


where r₁ and r₂ are both less than a and not negative. To prove that the quotient and remainder
are unique, we need to show that q₁ = q₂ and r₁ = r₂.


Since
q₁·a + r₁ = n = q₂·a + r₂,
if we subtract both q₂·a and r₁ from the previous equation, we obtain
q₁·a − q₂·a = r₂ − r₁.
Therefore,


(2) (q₁ − q₂)·a = r₂ − r₁.


Since r₁ and r₂ are both less than a and not negative, it follows that


−a < r₂ − r₁ < a.


In particular, the absolute value of r₂ − r₁ must be less than a. On the other hand, the absolute
value of any nonzero multiple of a must be greater than or equal to a. Therefore, using (2), we
see that (q₁ − q₂)·a is a multiple of a whose absolute value is less than a. As a result,


(3) (q₁ − q₂)·a = 0.


Since a ≠ 0, it follows immediately from (3) that q₁ − q₂ = 0. Thus, q₁ = q₂. Returning to (2),
since q₁ = q₂, we see that r₂ − r₁ = 0. Therefore, r₁ = r₂.


Exercises for Sections 3.1, 3.2, and 3.3 


For the exercises from these sections, we need to briefly introduce two pieces of terminology 
and notation. 


(a) Any positive integer n such that n ≥ 2 and n is not prime is called composite. As a result,
every integer n ≥ 2 is either prime or composite but not both. The list of composites
begins


4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 25, 26, 27, 28, ...


(b) If n ∈ N, we let n! = (1)·(2)·(3)···(n − 1)·(n), and we refer to n! as n factorial.
Observe that if n ∈ N, then (n + 1)! = (n + 1)·n!. As a result, we have


1! = 1, 2! = 2, 3! = 6, 4! = 24, 5! = 120, 6! = 720, 7! = 5040, ...


For convenience, we also let 0! = 1. 


1. Immediately after Definition 3.2, we listed all the primes less than 44. Continue the list 
so it includes all prime numbers less than 100. 


2. Use your list from exercise 1 to show that every even integer n, where 4 ≤ n ≤ 40, can be
written as a sum of two prime numbers. By doing so, you will have verified all cases of
Goldbach's Conjecture up to 40.


3. Use your list from exercise 1 to find all pairs of twin primes where the primes are less
than 100.


4. (a) If n is composite, show that there exists a prime number p that divides n and is less
than or equal to √n.


(b) Use part (a) to show that in order to check whether a positive integer m > 2 is
prime, you only need to confirm that it fails to be divisible by all primes less than or
equal to √m.


(c) In light of part (b), how would you go about determining whether the numbers 323,
353, and 371 are prime?

(d) Which of 323, 353, and 371 are prime? How do you know?


5. Find the prime factorization of each of the following integers: 480, 850, 7623.


6. Find the prime factorization of each of the following integers: 378, 2205, 7007.


7. Shortly after the statement of Theorem 3.3, we found the prime factorization of 360
by starting with the observation 360 = 10·36. Show that if we begin with the observation
360 = 18·20, then, up to order, we obtain the same prime factorization of 360.


8. As in exercise 7, show that the observations 360 = 8·45 and 360 = 12·30 both lead
to the same factorization of 360, up to order, that we found after the statement of
Theorem 3.3.


9. (a) What is the prime factorization of 13!?
(b) What is the prime factorization of 27!?


10. (a) What is the prime factorization of 20!?
(b) What is the prime factorization of 41!?


11. In this exercise, you may want to refer to your answers in exercise 9.
(a) When computing the exact value of 13!, how many zeroes appear at the far
right?
(b) When computing the exact value of 27!, how many zeroes appear at the far right?


12. (a) When computing the exact value of 1000!, how many zeroes appear at the far right?
(b) When computing the exact value of 200,000!, how many zeroes appear at the far
right?
(c) When computing the exact value of 5,000,000!, how many zeroes appear at the far
right?


13. (a) Find the smallest positive integer n such that the exact value of n! has exactly 1000
zeroes appearing at the far right. (You will probably need some trial and error to
solve this.)
(b) Use your answer to part (a) to find all positive integers t such that the exact value of
t! has exactly 1000 zeroes appearing at the far right.


14. Write a proof of the first part of Theorem 3.3 that uses the Second Version of
Mathematical Induction and does not use a proof by contradiction.


15. Suppose today is Monday.
(a) What day of the week will it be 100 days from now?
(b) What day of the week will it be 10,000 days from now?
(c) What day of the week will it be 1,000,000 days from now?


16. An oval track has a circumference of 440 yards. Therefore, after running 440 yards, you
are back at the starting point.
(a) How far from the starting point are you after running 1395 yards?
(b) How far from the starting point are you after running 4963 yards?
(c) After running 3440 yards, how far do you need to run to get back to the starting
point?


17. Let a = 2·3·5·7 + 11, b = 2·3·5 + 7·11, and c = 3·5·11 + 2·7. Without computing
the exact values of a, b, c, explain why none of these three numbers is divisible by a
prime less than 13.


18. Compute the numerical values of a, b, c from exercise 17 and find the prime factorization
of all three numbers.


19. (a) Use your list from exercise 1 to find all examples of three consecutive odd integers
that are less than 100 and are all prime numbers.
(b) Show that if m is an integer then m, m + 2, m + 4 all have different remainders
when divided by 3.
(c) Use part (b) to show that if n is a positive integer, then exactly one of n, n + 2, n + 4
is divisible by 3.
(d) Use part (c) to show that the example you found in part (a) is the only time that
three consecutive odd integers are all prime.


20. (a) Show that the sum of any three consecutive integers must be divisible by 3.
(b) Show that the sum of any five consecutive integers must be divisible by 5.


21. Generalize your results from exercise 20 to show that if n is a positive odd integer, then
the sum of any n consecutive integers must be divisible by n.


22. (a) Explain why the following ten integers 11! + 2, 11! + 3, 11! + 4, ..., 11! + 10,
11! + 11 are all composite.
(b) Explain why the following 50 integers 51! + 2, 51! + 3, 51! + 4, ..., 51! + 50,
51! + 51 are all composite.
(c) Show that there exist 1,000,000 consecutive positive integers that are all composite.


23. Generalize your results from exercise 22, and show that for any positive integer n, there
exist n consecutive positive integers that are all composite.


For exercises 24-29, please first read the following:


In exercises 22-23, we showed that there can exist long gaps between prime numbers. This is
quite interesting when contrasted with the fact that there exists an infinite number of primes.
In the next six exercises, we will see some additional proofs that there is an infinite number
of primes and will also prove that various subsets of N contain an infinite number of
primes.


24. (a) If n ∈ N, show that any prime number that divides n! + 1 must be larger than n.


(b) Use part (a) to show that for any integer n, there exists a prime number p such that 
p>n. 


(c) Explain how parts (a) and (b) can be used to give another proof that there is an 
infinite number of primes. 


25. Explain how your work in exercise 24 actually provides a proof of the following
statement: For any n ∈ N, there exists a prime number p such that n < p ≤ n! + 1.


26. (a) Show that if n is an odd integer, then there exists an integer k such that either
n = 4k + 1 or n = 4k + 3.


(b) Use part (a) to show that if n is an odd integer, then there is an integer l such that
n^2 = 4l + 1.


27. (a) Let k_1, k_2, ..., k_m be integers, and then let


t = (4k_1 + 1)(4k_2 + 1)···(4k_m + 1).


Show that there exists an integer s such that t = 4s + 1.


(b) Suppose n is an integer of the form n = 4k + 3, for some integer k. Use part (a) to
show there is a prime p that divides n such that p = 4w + 3, for some integer w.


28. We know that there is an infinite number of odd prime numbers and that in light of
exercise 26(a), each odd prime number is either of the form 4k + 1 or 4k + 3, for some
integer k. However, at this point, we do not know how many of these odd primes are of
the form 4k + 1 and how many are of the form 4k + 3. In this exercise, we will apply
various portions of exercises 26 and 27 to show that there is indeed an infinite number
of primes of the form 4k + 3. To this end, let v > 3 be an integer, and let p_1, p_2, ..., p_m
be a complete list of the odd primes that are less than or equal to v and then let
n = p_1·p_2···p_m.
(a) Show that there is an integer l such that n^2 + 2 = 4l + 3.


(b) By looking at those primes that divide n^2 + 2, show that there exists a prime p such
that p > v and p = 4s + 3, for some integer s.


(c) Use part (b) to show that there is an infinite number of primes of the form 4k + 3. 


29. You can easily use the division algorithm to check that every odd prime number other 
than 3 is either of the form 6k + 1 or 6k +5, for some integer k. However, at this point, 
we do not know how many of the infinite number of odd primes are of the form 6k + 1 
and how many are of the form 6k +5. In this exercise, we will show that there is an 
infinite number of primes of the form 6k + 5. With some modifications, we will follow 
the plan set out in exercise 28. Let v be a positive integer and let p_1, p_2, ..., p_m be all
the odd prime numbers that are more than 3 and less than or equal to v. Then let
n = p_1·p_2···p_m.
(a) Show that there is an integer l such that n^2 + 4 = 6l + 5.


(b) By looking at those primes that divide n^2 + 4, show that there exists a prime p such
that p > v and p = 6s + 5, for some integer s.


(c) Use part (b) to show that there is an infinite number of primes of the form 6k +5. 


30. For this exercise, we need to recall two facts on infinite series. The first is that
Σ_{i=1}^∞ 1/i is infinite. The second is that if m > 1, then Σ_{i=0}^∞ 1/m^i = m/(m − 1).


(a) If {p_1, p_2, ..., p_t} is a finite set of prime numbers and if


B = (Σ_{i=0}^∞ 1/p_1^i)·(Σ_{i=0}^∞ 1/p_2^i)···(Σ_{i=0}^∞ 1/p_t^i),


show that B must be finite.


(b) Show that B is equal to the sum of the reciprocals of all positive integers whose
prime divisors are a subset of {p_1, p_2, ..., p_t}.


(c) Show that part (b) can be used to provide another proof of the fact that the number 
of primes is infinite. 


For exercises 31-37, please read the following:


When trying to understand a new mathematical concept or property, it is desirable to look at 
some examples that have this property as well as some examples that do not. For example, 
when learning about continuity, things make much more sense when we contrast graphs of
continuous functions with graphs of functions that are not continuous. Similarly, to understand
limits, we should contrast points where limits exist with points where they do not exist. 


Students sometimes have trouble understanding unique factorization, or the need to prove the 
uniqueness of prime factorization, because such a contrast does not seem to exist. We cannot 
contrast examples of integers greater than 1 that factor uniquely into primes with integers
that do not, as there are no integers greater than 1 that do not factor uniquely. It is difficult to
internalize a concept or property without also looking at objects that do not possess this 
property. Therefore, students sometimes have difficulty imagining how something could fail to 
factor uniquely. Motivated by this problem, we present an example of a set that is very similar 
to the set of integers greater than 1, where unique factorization does not hold. 


We let E denote the set of positive even integers. Thus, 


E = {2, 4, 6, 8, 10, 12, ...}.


We say that an element n ∈ E is E-prime if n cannot be written as the product of smaller
elements of E. The E-prime elements of E will play a role in E analogous to the role played by 
ordinary primes in N. In particular, in the following exercises, we will show that every element 
of E can be written as a product of E-primes. However, we will also show that factorization 
into E-primes is not unique. Hopefully, this example will help put the concept of unique 
factorization into primes in N into better perspective. 
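

To experiment with this definition, the following short Python sketch lists every way of writing an element of E as a product of E-primes in nondecreasing order. The function names is_e_prime and e_factorizations are our own and do not come from the text. The sketch relies on the observation that a product of several even numbers is again even, so n fails to be E-prime exactly when n = x·y for two smaller elements x, y of E.

```python
def is_e_prime(n):
    """Return True if the positive even integer n is E-prime."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    # n = x * y with x and y both even forces 2 <= x <= n // 2.
    return not any(n % x == 0 and (n // x) % 2 == 0
                   for x in range(2, n // 2 + 1, 2))


def e_factorizations(n, smallest=2):
    """List all ways to write n as a nondecreasing product of E-primes,
    using only E-prime factors that are at least `smallest`."""
    result = []
    if n >= smallest and is_e_prime(n):
        result.append([n])  # a product of length 1
    for x in range(smallest, n // 2 + 1, 2):
        cofactor = n // x
        if n % x == 0 and cofactor % 2 == 0 and cofactor >= x and is_e_prime(x):
            for rest in e_factorizations(cofactor, x):
                result.append([x] + rest)
    return result


print(e_factorizations(8))    # one factorization only
print(e_factorizations(100))  # two different factorizations
```

For instance, 8 can only be written as 2·2·2, whereas 100 can be written both as 2·50 and as 10·10, so factorization into E-primes is not unique for 100.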


31. Show that every element of E can be written as a product of E-primes. (Remember, when 
necessary, we allow products of length 1.) 


32. Show that 2, 6, 18 are all E-primes. 


33. Show that 4, 8, 20 are not E-primes. 


34. Generalize exercises 32 and 33 and show that the E-primes are precisely those positive 
even integers that are not divisible by 4 in N. 


35. Show that 36 can be factored more than one way, up to order, into E-primes in E.


36. Show that 60 can be factored more than one way, up to order, into E-primes in E.


37. Determine a simple criterion for when elements of E factor uniquely into
E-primes in E.


3.4 Greatest Common Divisors 


The next tool we will need to prove the uniqueness of prime factorization is greatest common 
divisors. 


Definition 3.6. If a and b are nonzero integers, then we let gcd(a, b) denote the greatest
common divisor of a and b, which is the largest positive integer that is a divisor of both a and
b. More generally, if a_1, a_2, ..., a_n are integers, with at least one of them nonzero, then
gcd(a_1, a_2, ..., a_n) is the largest positive integer that is a divisor of every one of the a_i’s.


Examples 


gcd(36, 10) = 2, gcd(10, 36) = 2, gcd(36, −10) = 2,
gcd(27, 16) = 1, gcd(1000, 144) = 8, gcd(42, 57, 30) = 3, gcd(6, 10, 15) = 1


In light of the preceding definition, there are several questions you may have: 


(a) Given two nonzero integers, is there any guarantee that they actually have a greatest 
common divisor? 


(b) How do we compute greatest common divisors? 
(c) How does this concept relate to the uniqueness of prime factorization? 


To illustrate what is happening, let us examine gcd(3476, −948). The first observation we can
make is that the numbers −948 and 948 have precisely the same divisors. As a result,
gcd(3476, −948) = gcd(3476, 948). Therefore, when computing greatest common divisors,
we can replace each integer by its absolute value without changing the final answer. 


The number 1 is certainly a divisor of both 3476 and 948, so 3476 and 948 have a common 
divisor. However, do they have a greatest common divisor? Observe that 948 is the largest 
divisor of 948. Therefore, no common divisor of 3476 and 948 can possibly exceed 948. Now 
consider the finite set {1,2,3,...,947, 948}. Every positive divisor of 948 belongs to this set. 
One by one, we can check the 948 elements of this set to see if any of them, other than 1, are 
divisors of both 3476 and 948. The largest number in {1, 2,3, ..., 947, 948} that divides both 
3476 and 948 must certainly be the greatest common divisor of 3476 and 948. It turns out that 
316 is the largest divisor of both 3476 and 948, so gcd(3476, −948) = gcd(3476, 948) = 316.


There was nothing special about the numbers 3476 and 948. The same reasoning just used 
shows that any two nonzero integers have a greatest common divisor. In fact, our reasoning 
provides an algorithm for finding the greatest common divisor. Given two positive integers a 
and b, we simply check every integer from 1 up to the smaller of a and b to see which of them 
are divisors of both a and b. The largest number we check that turns out to be a divisor of both 
a and b is indeed the greatest common divisor of a and b. Although this algorithm is 
particularly tedious and inefficient, it does tell us that greatest common divisors exist and that 
there is at least one straightforward way to compute them. 
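

The exhaustive search just described can be sketched in a few lines of Python. The name gcd_brute_force is our own choice. Negative inputs are replaced by their absolute values, as justified above.

```python
def gcd_brute_force(a, b):
    """Greatest common divisor of two nonzero integers by exhaustive search."""
    if a == 0 or b == 0:
        raise ValueError("a and b must be nonzero")
    a, b = abs(a), abs(b)  # a number and its negative have the same divisors
    best = 1               # 1 is always a common divisor
    for d in range(1, min(a, b) + 1):
        if a % d == 0 and b % d == 0:
            best = d       # d is a larger common divisor than any seen so far
    return best


print(gcd_brute_force(3476, -948))  # 316
```

The loop performs min(|a|, |b|) divisibility checks, which is what makes this approach tedious for large inputs.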




If you were introduced to greatest common divisors in elementary or high school, you may 
have seen a different algorithm for computing them that uses prime factorization. Suppose you 
are given two positive integers that are already factored into primes. For example, consider


106480 = 2^4·5^1·11^3 and 6776 = 2^3·7^1·11^2.


The prime numbers that appear in the prime factorization of both 106480 and 6776 are the 
only possible primes that can appear in the factorization of gcd(106480, 6776). However, to 
find the exponent of a prime in the factorization of gcd(106480, 6776), we compare its 
exponents in the factorization of both 106480 and 6776 and choose the smaller one. 
Therefore, the only primes that appear in the factorization of gcd(106480, 6776) are 2 and 11. 
Furthermore, the smallest exponent of 2 that appears in the factorization of 106480 and 6776 
is 3, and the smallest exponent of 11 that appears in the factorization of 106480 and 6776 is 2. 
As a result, gcd(106480, 6776) = 2^3·11^2 = 968.
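

For small inputs, this comparison of exponents can be sketched in Python as follows. The helper names prime_factorization and gcd_by_factoring are our own, and the factorization is found by simple trial division, which is only an assumption made to keep the sketch self-contained.

```python
from collections import Counter


def prime_factorization(n):
    """Return a Counter mapping each prime divisor of n >= 2 to its exponent."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:   # divide out p as many times as possible
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:               # whatever remains is a single prime
        factors[n] += 1
    return factors


def gcd_by_factoring(a, b):
    """gcd of positive integers a, b from the smaller exponent of each shared prime."""
    fa, fb = prime_factorization(a), prime_factorization(b)
    result = 1
    for p in fa.keys() & fb.keys():   # primes appearing in both factorizations
        result *= p ** min(fa[p], fb[p])
    return result


print(dict(prime_factorization(106480)))  # {2: 4, 5: 1, 11: 3}
print(gcd_by_factoring(106480, 6776))     # 968
```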


At first glance, the preceding procedure appears to be a fairly quick and easy way to compute 
greatest common divisors. However, for both theoretical and practical reasons, the preceding 
procedure is not the best for computing greatest common divisors. When we used it to 
compute gcd(106480, 6776), we used the prime factorization of 106480 and 6776. However, 
at this point, we haven’t yet proven that factorization into primes is unique. If it were 
somehow possible to factor 106480 and 6776 in different ways, that could lead to different 
answers when computing gcd(106480, 6776). Therefore, until we prove the uniqueness of 
prime factorization, it is not really valid to use this procedure to compute greatest common 
divisors. 


It turns out that if you are given a large number, it is extremely difficult to write it as a product 
of primes. If you are given a 50-digit number, although you know that it can be written as a 
product of primes, it may take many hours of computing time to figure out what its prime 
factorization looks like. Therefore, as a practical matter, using prime factorization to find 
greatest common divisors is a very slow and inefficient algorithm. At this point, we can 
introduce an algorithm for computing greatest common divisors that is not only much more 
efficient (when dealing with large numbers) but also has the added benefit that we will be able 
to use it to prove the uniqueness of prime factorization. Let us begin by making some 
observations. 


gcd(36, 10) = 2 and 2 = 2·36 + (−7)·10,
gcd(15, 28) = 1 and 1 = (−13)·15 + 7·28,
gcd(1000, 144) = 8 and 8 = (−1)·1000 + 7·144,
gcd(−36, 10) = 2 and 2 = (−2)·(−36) + (−7)·10.


Let’s think about what this means. First we observed that 2 was the greatest common divisor 
of 36 and 10, and we found integers m and n such that 


2 = m·36 + n·10.


Next we observed that 1 was the greatest common divisor of 15 and 28, and we found integers
m and n such that


1 = m·15 + n·28.


Note that we also did the same for gcd(1000, 144) and gcd(−36, 10).


To formalize what we just did, given integers a and b, we say that an integral combination of
a and b is any number that can be written in the form


m·a + n·b,


where m and n are also integers. Therefore, we see that gcd(36, 10) can be written as an 
integral combination of 36 and 10. Similarly, gcd(15, 28) can be written as an integral 
combination of 15 and 28. It will be very useful for us to show that gcd(a, b) can always be 
written as an integral combination of a and b. To do this we will need a short lemma. 


Lemma 3.7. Let a, b, m, n, and c be integers such that c is a divisor of a and b. Then c is also
a divisor of m·a + n·b.


Proof. Since c is a divisor of both a and b, there exist integers s and t such that a = s·c and
b = t·c. We now have


m·a + n·b = m·(s·c) + n·(t·c) = (m·s)·c + (n·t)·c = (m·s + n·t)·c.


As a result, c is a divisor of m·a + n·b.


3.5 Euclidean Algorithm 


A special case of Lemma 3.7, which we will refer to quite often, is that if a, b, q, and r are
integers such that b = q·a + r, and if c is a divisor of both a and r, then c is also a divisor of b.
We can now introduce the Euclidean Algorithm for finding greatest common divisors. It will 
be best to begin with some examples. 


Example 


Let’s try to compute gcd(81, 24). If we apply the division algorithm to 81 and 24, we 
obtain 


81 = 3·24 + 9.


Now, apply the division algorithm to 24 and 9 to obtain


24 = 2·9 + 6.


Next, apply the division algorithm to 9 and 6 to obtain


9 = 1·6 + 3.


Finally, apply the division algorithm to 6 and 3 to obtain


6 = 2·3 + 0.


We claim that this collection of calculations can be used to show that gcd(81, 24) = 3 and that 
3 can be written as an integral combination of 81 and 24. First, observe that our last equation 
tells us that 3 is a divisor of 6. Moving up by one on our list of equations, since 3 is a divisor of 
6 and 3, Lemma 3.7 tells us that 3 is also a divisor of 9. Again moving up by one on our list of 
equations, since 3 is a divisor of 9 and 6, Lemma 3.7 now tells us that 3 is a divisor of 24. 
Finally, moving to the top equation, since 3 is a divisor of 24 and 9, Lemma 3.7 tells us that 3 
is a divisor of 81. Therefore, 3 is certainly a common divisor of 81 and 24, but we haven’t yet 
shown that it is the greatest common divisor. 


Let us go back to our next to last equation and rewrite it as


3 = 9 − 1·6.


Moving up an equation, we can replace 6 by 24 − 2·9 to obtain


3 = 9 − 1·6 = 9 − 1·(24 − 2·9) = (−1)·24 + 3·9.


Again moving up an equation, we can replace 9 by 81 − 3·24 to obtain


3 = (−1)·24 + 3·9 = (−1)·24 + 3·(81 − 3·24) = 3·81 + (−10)·24.


As a result, we have expressed 3 as an integral combination of 81 and 24. Now suppose
that d is a positive number that is a divisor of both 81 and 24. Then, Lemma 3.7 tells us that d
must also be a divisor of any integral combination of 81 and 24. In particular, d must be a
divisor of


3 = 3·81 + (−10)·24.


Since d is a divisor of 3, d cannot be greater than 3. Therefore, 3 is greater than or equal 
to any divisor of both 81 and 24. Hence, 3 is indeed the greatest common divisor of 81 
and 24. 
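

The chain of divisions used in this example can be generated mechanically. The following Python sketch, in which the name euclid_steps is our own, records each application of the division algorithm to positive integers and returns the last nonzero remainder.

```python
def euclid_steps(a, b):
    """Repeatedly apply the division algorithm to positive integers a and b.

    Returns (steps, g): each step is a tuple (dividend, quotient, divisor,
    remainder), and g is the last nonzero remainder (or b itself if b divides a).
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)        # a = q*b + r with 0 <= r < b
        steps.append((a, q, b, r))
        a, b = b, r                # continue with the divisor and the remainder
    return steps, a


steps, g = euclid_steps(81, 24)
for dividend, q, divisor, r in steps:
    print(f"{dividend} = {q}*{divisor} + {r}")
print("gcd =", g)
```

Run on 81 and 24, the sketch prints the same four equations as above, followed by gcd = 3.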




Let us work through another example before formalizing things. 


Example


To compute gcd(166, 75), we begin by dividing 166 by 75 to obtain


166 = 2·75 + 16.


Now, divide 75 by 16 to obtain


75 = 4·16 + 11.


Next, divide 16 by 11 to obtain


16 = 1·11 + 5.


Next, divide 11 by 5 to obtain


11 = 2·5 + 1.


Finally, divide 5 by 1 to obtain


5 = 5·1 + 0.


We can now crawl upward through our list of equations to show that 1 can be written as an
integral combination of 166 and 75. The next to last equation can be rewritten as


1 = 11 − 2·5.


Moving up an equation and replacing 5 by 16 − 1·11, we obtain


1 = 11 − 2·5 = 11 − 2·(16 − 1·11) = (−2)·16 + 3·11.


Moving up an equation, we can replace 11 by 75 − 4·16 to obtain


1 = (−2)·16 + 3·11 = (−2)·16 + 3·(75 − 4·16) = 3·75 + (−14)·16.


Finally, we move to the top equation and replace 16 by 166 − 2·75 to obtain


1 = 3·75 + (−14)·16 = 3·75 + (−14)·(166 − 2·75) = (−14)·166 + 31·75.


Having expressed 1 as an integral combination of 166 and 75, we see that any common divisor
of 166 and 75 must also be a divisor of 1. Since 1 is certainly a common divisor of 166 and 75, 
we see that 1 = gcd(166, 75). 
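

The upward crawl through the equations can also be carried out by a short recursive routine. The sketch below assumes a and b are positive, and the name extended_gcd is our own. Each call performs one division and then substitutes the remainder exactly as in the examples.

```python
def extended_gcd(a, b):
    """Return (g, m, n) with g = m*a + n*b, for positive a and nonnegative b."""
    if b == 0:
        return a, 1, 0               # a = 1*a + 0*b
    q, r = divmod(a, b)              # a = q*b + r
    g, m, n = extended_gcd(b, r)     # g = m*b + n*r
    # Replace r by a - q*b:  g = m*b + n*(a - q*b) = n*a + (m - q*n)*b
    return g, n, m - q * n


print(extended_gcd(81, 24))    # (3, 3, -10)
print(extended_gcd(166, 75))   # (1, -14, 31)
```

Both outputs agree with the integral combinations found by hand: 3 = 3·81 + (−10)·24 and 1 = (−14)·166 + 31·75.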




In the preceding examples, we have used the Euclidean Algorithm to express gcd(a, b) as an 
integral combination of a and b. In our examples, a and b were both positive. We will now see 
what to do if a or b is negative. 


Example


Let’s look at gcd(−166, 75), gcd(166, −75), and gcd(−166, −75). Using the fact that


1 = gcd(166, 75) = (−14)·166 + 31·75,


we can multiply in the appropriate places by −1 to obtain


1 = gcd(−166, 75) = 14·(−166) + 31·75,
1 = gcd(166, −75) = (−14)·166 + (−31)·(−75),
1 = gcd(−166, −75) = 14·(−166) + (−31)·(−75).


In light of these observations, we can write gcd(a, b) as an integral combination of a and b 
regardless of whether a and b are positive or negative. We can now formalize all of our 
observations. 


Theorem 3.8. If a and b are nonzero integers, then gcd(a, b) is the smallest positive integer 
that can be written as an integral combination of a and b. 


Proof. The proof consists of describing and analyzing the Euclidean Algorithm in a more 
formal way. If at any point you find this formal argument hard to follow, you should go back 
and compare it to the examples we worked through earlier. By the preceding observations, we 
may assume that a and b are both positive. We begin by applying the division algorithm to a 
and b to obtain integers q_1 and r_1 such that


b = q_1·a + r_1 and 0 ≤ r_1 < a.


Next, divide a by r_1 to obtain integers q_2, r_2 such that


a = q_2·r_1 + r_2 and 0 ≤ r_2 < r_1.


Then divide r_1 by r_2 to obtain integers q_3, r_3 such that


r_1 = q_3·r_2 + r_3 and 0 ≤ r_3 < r_2.


Observe that every time we apply the division algorithm, we obtain a smaller remainder than
in the previous step. In particular, a > r_1 > r_2 > r_3. Therefore, if we continue this process of
dividing remainder r_i by the next remainder r_{i+1}, we will eventually obtain a remainder of 0.
Let us now suppose that n is the positive integer such that r_n is the last remainder that is not
zero. This says that if we continue to apply this procedure, we will eventually obtain the
equations


r_{n−2} = q_n·r_{n−1} + r_n and r_{n−1} = q_{n+1}·r_n + 0,


where


a > r_1 > r_2 > r_3 > ··· > r_{n−2} > r_{n−1} > r_n > 0.


The last equation tells us that r_n is a divisor of r_{n−1}. Applying Lemma 3.7 to the next to last
equation tells us that r_n is also a divisor of r_{n−2}. We can continue to move upward through our
list of equations, and if we apply Lemma 3.7 at every step, we see that


r_n | r_{n−1}, r_n | r_{n−2}, r_n | r_{n−3}, ..., r_n | r_2, r_n | r_1, r_n | a, r_n | b.


Therefore, r_n is a common divisor of a and b.


The next to last equation shows that r_n can be written as an integral combination of r_{n−2} and
r_{n−1}. Moving up to the next equation enables us to replace r_{n−1} by an integral combination of
r_{n−3} and r_{n−2}, which shows that r_n can be written as an integral combination of r_{n−3} and r_{n−2}.
Continuing in this way, we see that eventually r_n can be written as an integral combination of
a and r_1, and, finally, r_n can be written as an integral combination of b and a.


If d is any common divisor of a and b, then Lemma 3.7 asserts that d is a divisor of any
integral combination of a and b. Therefore, d is also a divisor of r_n, so r_n must be greater than
or equal to d. Since we have shown that r_n is both a common divisor of a and b and is also
greater than or equal to any common divisor of a and b, we see that r_n = gcd(a, b).


Having succeeded in showing that gcd(a, b) can be written as an integral combination of a and 
b, we now need to show that gcd(a, b) is the smallest positive integer that can be written this 
way. To this end, suppose f is a positive integer that can be also written as an integral 
combination of a and b. We need to show that gcd(a, b) is less than or equal to f. 


Since gcd(a, b) is a common divisor of a and b, Lemma 3.7 implies that gcd(a, b) is also a
divisor of f. However, since gcd(a, b) is a divisor of f, we see that gcd(a, b) is less than or 
equal to f. Thus, gcd(a, b) is indeed the smallest positive integer that can be written as an 
integral combination of a and b. 
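

Theorem 3.8 can be checked numerically on small cases. The Python sketch below searches a finite window of coefficients m and n for the smallest positive value of m·a + n·b and compares it with the greatest common divisor. The function name and the window size bound are our own assumptions. A finite search is only a sanity check on particular numbers, not a proof, and the window must be large enough to contain a pair of coefficients that produces gcd(a, b).

```python
from math import gcd


def smallest_positive_combination(a, b, bound=50):
    """Smallest positive value of m*a + n*b with -bound <= m, n <= bound."""
    values = [m * a + n * b
              for m in range(-bound, bound + 1)
              for n in range(-bound, bound + 1)]
    return min(v for v in values if v > 0)


for a, b in [(36, 10), (15, 28), (1000, 144), (166, 75), (-36, 10)]:
    print(a, b, smallest_positive_combination(a, b), gcd(a, b))
```

In each of these cases the two printed values agree, as the theorem predicts.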


The Euclidean Algorithm is a fast and easy algorithm for computing greatest common 
divisors. It can easily be programmed into a graphing calculator and is far faster than 
algorithms that involve finding the prime factorization of a number. Not only will we use it to
prove the uniqueness of prime factorization, but in Chapter 12 we will use a modified version
of it to better understand polynomials. 


If a and b are integers such that gcd(a, b) = 1, we say that a and b are relatively prime. Two
easy, but important, observations are 


(a) If a,b are nonzero integers, then a and b are relatively prime if and only if there is no 
prime number that is a factor of both a and b. 


(b) If p and q are prime numbers, then either they are equal or they are relatively prime. 


The following somewhat technical lemma will turn out to be extremely useful in proving the 
uniqueness of prime factorization. 


Lemma 3.9. If a, b, n are nonzero integers, where n | (a·b) and gcd(a, n) = 1, then n | b.


Intuition. Consider the case where n = 12, a = 8, b = 18. In this case, 12 ∤ 8 and 12 ∤ 18, yet
12 | (8·18). Therefore, in general, it is possible for an integer n to divide the product a·b even
though it divides neither a nor b. However, the point of Lemma 3.9 is that if n divides the
product a·b and if n is relatively prime to a, then n must divide b. The key to the proof will
be that we can write 1 as an integral combination of a and n. Note that in our example,
gcd(a, n) = gcd(8, 12) = 4 and gcd(b, n) = gcd(18, 12) = 6. Thus, n is relatively prime to
neither a nor b.


Proof. Since a and n are relatively prime, we can write 1 as an integral combination of a and 
n. Therefore, there exist integers r and s such that 


1 = r·a + s·n.


Multiplying this equation by b results in


b = b·(r·a) + b·(s·n) = r·(a·b) + (b·s)·n.


Therefore, b is an integral combination of a·b and n. Since n is a divisor of both a·b and n,
Lemma 3.7 implies that n | b.
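

To see the proof at work on actual numbers, the following Python sketch produces integers r and s with 1 = r·a + s·n and then checks the identity b = r·(a·b) + (b·s)·n. The function name is our own. The sketch assumes a and n are positive and relatively prime, and it relies on the built-in pow with exponent −1, which is available in Python 3.8 and later, to find r.

```python
def combination_for_one(a, n):
    """Return (r, s) with 1 == r*a + s*n, assuming a, n > 0 and gcd(a, n) == 1."""
    r = pow(a, -1, n)       # r*a leaves remainder 1 on division by n;
                            # raises ValueError if a and n are not relatively prime
    s = (1 - r * a) // n    # the division is exact, so 1 = r*a + s*n
    return r, s


a, b, n = 5, 14, 7          # 7 divides 5*14 = 70 and gcd(5, 7) = 1
r, s = combination_for_one(a, n)
print(r, s)                              # 3 -2
print(r * a + s * n)                     # 1
print(r * (a * b) + (b * s) * n)         # 14, which equals b
```

Here 14 = 3·70 + (−28)·7, and since 7 divides both 70 and 7, it must divide 14.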


We can now record an important consequence of Lemma 3.9. 


Corollary 3.10. Let p, q_1, q_2, q_3, ..., q_n be prime numbers (which are not necessarily
distinct). If p | (q_1·q_2·q_3···q_n), then p is equal to one of the q_i’s.


Intuition. Let us consider the special case where q_1, q_2 are prime numbers and the prime 5
divides the product q_1·q_2. We want to show that either q_1 = 5 or q_2 = 5. There are two
possibilities, either q_2 = 5 or q_2 ≠ 5. In the first case, we are done. In the second case, 5 and q_2
are relatively prime. But then Lemma 3.9 implies that 5 divides q_1. However, since q_1 is
prime, this immediately tells us that q_1 = 5.


In the preceding argument, there was nothing special about the prime 5. So the argument really
tells us that if a prime p divides a product q_1·q_2 of primes, then p is equal to q_1 or q_2. The
general result works for a product of n primes, not just two primes, and will require 
Mathematical Induction. It turns out that this result is a relatively easy consequence of 
Lemma 3.9 but would be quite difficult to prove without Lemma 3.9. This explains the 
importance of Lemma 3.9 in the proof of the uniqueness of prime factorization. 


Proof. We let T be the set of positive integers n such that whenever a prime p divides the
product q_1·q_2·q_3···q_n of n primes, then p is equal to one of the q_i’s. We need to show that
T = N, and we will proceed by using Mathematical Induction. First we need to show that T
contains 1. So let us consider the case where p and q_1 are primes such that p | q_1. Note that in
this case we are considering q_1 to be a product of primes of length 1. Since p | q_1, p > 1, and q_1
is prime, it immediately follows that p = q_1. Therefore, 1 ∈ T.


Next we consider the case where T contains some positive integer k. We need to show that T
also contains k + 1. Therefore, suppose that we are now in the situation where p, q_1, q_2, q_3, ...,
q_k, q_{k+1} are prime numbers such that


(4) p | (q_1·q_2·q_3···q_k·q_{k+1}).


We need to show that p is equal to one of the q_i’s.


There are two possibilities: either p = q_{k+1} or p ≠ q_{k+1}. In the first case, we are done. In the
second case, let


b = q_1·q_2·q_3···q_k;


then (4) becomes p | (b·q_{k+1}). However, in this second case, p and q_{k+1} are relatively prime.
Therefore, we can apply Lemma 3.9 to assert that p | b. But b is a product of k primes and T
contains k. Therefore, p is indeed equal to one of the q_i’s that appear in b, and we are also
done in this case.


We now have all the mathematical machinery needed to prove the uniqueness of prime 
factorization. Before writing the formal proof, let us work through an example that should help 
us understand the proof. 


Example 


How does Corollary 3.10 help us study prime factorization? Consider the number 85; we
know it can be factored into primes as 85 = 5^1·17^1. Let us see why this is the only way
that 85 can be factored into primes. To this end, suppose q_1, ..., q_n are prime numbers
such that


5^1·17^1 = 85 = q_1·q_2···q_n.


Since 5 | 85, Corollary 3.10 asserts that 5 must be equal to one of the q_i’s. By reordering
the q_i’s we may assume that q_1 = 5. Dividing both sides of the preceding equation by 5,
we obtain


17^1 = 85/5 = q_2···q_n.


Since 17 divides the left-hand side of the preceding equation, 17 must divide q_2···q_n.
However, Corollary 3.10 now asserts that 17 must equal one of the remaining q_i’s. By
reordering the remaining q_i’s, we may assume that q_2 = 17. Dividing both sides of the
previous equation by 17 results in


1 = 85/(5·17) = q_3···q_n.


Since all prime numbers exceed 1, no product of primes can equal 1. Therefore, it is
impossible to have the product q_3···q_n equal to 1. As a result, there do not exist primes
q_3, q_4, ..., q_n in the factorization of 85. Therefore, 85 is the product of only two primes,
q_1 and q_2. However, we showed that after reordering, q_1 = 5 and q_2 = 17. As a result, up
to order, 5^1·17^1 is the only way to write 85 as a product of primes.


Observe that in the preceding argument, we used the fact that any product of primes is greater
than 1. This illustrates the usefulness of not considering 1 to be prime. 


Oftentimes, the notation we use can simplify the writing of a proof. In the following proof, 
when we write an integer as a product of primes, we will write the primes in nondecreasing 
order with all exponents equal to one. What does this mean? Using this notation, we write 12 
as 12 = 2·2·3. Similarly, we write 720 as 720 = 2·2·2·2·3·3·5. More generally, in this
notation, if we factor n into primes as


n = p_1·p_2·p_3···p_m,


then


p_1 ≤ p_2 ≤ p_3 ≤ ··· ≤ p_m.


The advantage of this notation is that we have fixed the order in which the primes can occur, 
and we will not need to worry about reordering them during the proof. 




Proof of the final part of Theorem 3.3—uniqueness of prime factorization. We will proceed 
using a slightly modified form of the Second Version of Mathematical Induction. More 
precisely, we will let T denote those integers that are at least 2 and can be factored uniquely
into primes. Our goal is to show that T contains all integers that are at least 2. Mathematical 
Induction asserts that it will be enough for us to show that T contains 2 and that whenever T 
contains the set of numbers {2, 3,..., k}, then it also contains the number k + 1. 


The number 2 can certainly be factored into primes as 2 = 2^1. Since 2 is the smallest prime
number, any other product of primes would exceed 2. Therefore, 2 = 2^1 is the only way to
factor 2 into a product of primes, and we see that T contains 2. 


Now suppose that T contains the set of numbers {2, 3, ...,k}; we need to show that T contains 
k +1. Therefore, we may assume that every integer that is less than k+ 1 and larger than 1 can 
be factored uniquely into primes and our job is to show that k + 1 can also be factored 
uniquely into primes. In order to do this, we need to show that any two factorizations of k+ 1 
into primes are the same. To this end, suppose


k + 1 = p_1·p_2···p_n and k + 1 = q_1·q_2···q_m


are two factorizations of k + 1 into primes using the special ordering of primes we discussed
earlier. To show that these two factorizations of k + 1 are the same, we need to show two
things. First, we must show that the number of primes in each factorization is the same, which
means that we need to show that n = m. Then we will need to show that the same exact primes
occur in each factorization, which means that we need to show that p_i = q_i, for every i ≤ n.


Since p_1 and q_1 are both divisors of k + 1, we have


q_1 | (p_1·p_2···p_n) and p_1 | (q_1·q_2···q_m).


Applying Corollary 3.10, we see that q_1 = p_i, for some i ≤ n, and p_1 = q_j, for some j ≤ m. In
light of the special ordering we are using, we know that p_1 ≤ p_i and q_1 ≤ q_j. Therefore,


p_1 ≤ p_i = q_1 ≤ q_j = p_1.


As a result, p_1 = q_1.


Since


(k + 1)/p_1 = (k + 1)/q_1,


we can now consider the factorization of (k + 1)/p_1. We now have


(5) (k + 1)/p_1 = p_2···p_n and (k + 1)/p_1 = q_2···q_m.


There are two cases to consider; either


(k + 1)/p_1 = 1 or (k + 1)/p_1 > 1.


In the first case, there are no primes other than p_1 = q_1 in the prime factorization of k + 1;
otherwise, the products p_2···p_n and q_2···q_m would exceed 1. As a result, n = m = 1. When
we combine this with the fact that p_1 = q_1, we see that our two factorizations of k + 1,


k + 1 = p_1 and k + 1 = q_1,


are the same. Therefore, in this case, there is only one way to factor k + 1, and so T contains
k + 1.


The final case to consider is when (k + 1)/p_1 > 1. However, in this case (k + 1)/p_1 is less than
k + 1 and larger than 1. Thus, T contains (k + 1)/p_1. Therefore the factorization of (k + 1)/p_1
into primes must be unique. This tells us that the number of primes in our two factorizations of
(k + 1)/p_1 must be the same. Going back to equation (5), we see that n − 1 = m − 1, which
immediately tells us that n = m. The next thing we know after looking back at equation (5) is
that the same exact primes appear in each factorization of (k + 1)/p_1. Therefore,


p_2 = q_2, p_3 = q_3, ..., p_n = q_n.


Since we already knew that p_1 = q_1, we now see that our two factorizations of k + 1 are the
same. Therefore, in both cases, T contains k + 1.


Theorem 3.3 has some very nice applications to the values of familiar functions like x^(1/2), x^(1/3),
and log_10(x). To prove these applications, we first need to prove a fact which is a direct
consequence of Theorem 3.3.


Lemma 3.11. If a and n are positive integers, then the primes that are divisors of a^n are
precisely the same primes that are divisors of a.


Intuition. Let us consider some examples: 
36 = 2^2·3^2 and 36^5 = 2^10·3^10,
245 = 5^1·7^2 and 245^8 = 5^8·7^16,
99 = 3^2·11^1 and 99^50 = 3^100·11^50.


Based on these examples, Lemma 3.11 certainly seems to be true. Its proof will follow easily 
from Theorem 3.3. 


Proof. If a = 1, then a^n = a = 1, and there are no primes that divide either a or a^n. On the
other hand, if a > 1, then we can uniquely factor a into primes as


a = p_1^{t_1}·p_2^{t_2}···p_m^{t_m}.


Therefore,


a^n = (p_1^{t_1}·p_2^{t_2}···p_m^{t_m})^n = p_1^{n·t_1}·p_2^{n·t_2}···p_m^{n·t_m}.


When we compare the prime factorizations of a and a^n, we can see that the same primes
appear in both. The only difference is that the exponent of each prime in the factorization of a^n
is n times its exponent in the factorization of a.


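In Proposition 2.4, we showed that every number on the list

\[ \sqrt{1}, \sqrt{2}, \sqrt{3}, \sqrt{4}, \sqrt{5}, \ldots, \sqrt{99}, \sqrt{100}, \sqrt{101}, \ldots \]

is either an integer or is irrational. The proof we gave was not very intuitive and used only the Well Ordering Principle. Now consider the following lists.

\[ \sqrt[3]{1}, \sqrt[3]{2}, \sqrt[3]{3}, \sqrt[3]{4}, \sqrt[3]{5}, \ldots, \sqrt[3]{99}, \sqrt[3]{100}, \sqrt[3]{101}, \ldots \]
\[ \sqrt[4]{1}, \sqrt[4]{2}, \sqrt[4]{3}, \sqrt[4]{4}, \sqrt[4]{5}, \ldots, \sqrt[4]{99}, \sqrt[4]{100}, \sqrt[4]{101}, \ldots \]
\[ \sqrt[5]{1}, \sqrt[5]{2}, \sqrt[5]{3}, \sqrt[5]{4}, \sqrt[5]{5}, \ldots, \sqrt[5]{99}, \sqrt[5]{100}, \sqrt[5]{101}, \ldots \]
\[ \sqrt[6]{1}, \sqrt[6]{2}, \sqrt[6]{3}, \sqrt[6]{4}, \sqrt[6]{5}, \ldots, \sqrt[6]{99}, \sqrt[6]{100}, \sqrt[6]{101}, \ldots \]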


As an application of the tools developed in this chapter, we will now show that every number 


on these lists either is an integer or is irrational. The proof will be much simpler and more 
intuitive than the proof of Proposition 2.4. 


Corollary 3.12. If $a$ and $n$ are positive integers, then $a^{1/n}$ is either an integer or is irrational.


Proof. Suppose $a^{1/n}$ is rational; it will suffice to show that it must be an integer. To this end, if $a^{1/n}$ is rational, then we can write it as a fraction,

\[ a^{1/n} = \frac{c}{d}, \]

where $c, d$ are positive integers. Furthermore, we may assume that the fraction is in lowest terms. Therefore, to show that $a^{1/n}$ is an integer, it will be enough to show that $d = 1$. We know that every integer greater than 1 is divisible by some prime. In fact, a surprisingly common technique for proving that a positive integer is equal to 1 is to show that it is not divisible by any prime. Indeed, this is how we will show that $d = 1$.




If $a^{1/n} = \frac{c}{d}$, then raising both sides to the $n$th power results in

\[ a = \left(\frac{c}{d}\right)^n, \]

which immediately implies that

\[ a \cdot d^n = c^n. \]

If $p$ is a prime number that divides $d$, then the preceding equation implies that $p$ divides $c^n$. However, Lemma 3.11 now asserts that $p$ divides $c$. This says that $p$ is a common factor of $c$ and $d$, contradicting our assumption that $\frac{c}{d}$ is in lowest terms. In light of this, it is impossible for $d$ to be divisible by any prime number. However, Theorem 3.3 tells us that the only positive integer that is not divisible by a prime is 1. Thus, $d = 1$ and $a^{1/n}$ is indeed an integer.
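Corollary 3.12 turns the question "is $a^{1/n}$ rational?" into the question "is $a$ the $n$th power of an integer?", which a computer can decide exactly. The sketch below is one possible way to do this with integer arithmetic only; the function name `exact_root` is hypothetical and the binary search is our own choice of method.

```python
def exact_root(a, n):
    """Return the positive integer r with r**n == a, or None if no such r exists.

    By Corollary 3.12, a result of None means that a**(1/n) is not rational.
    Both a and n are assumed to be positive integers.
    """
    lo, hi = 1, a
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** n
        if power == a:
            return mid
        if power < a:
            lo = mid + 1
        else:
            hi = mid - 1
    return None
```

For instance, `exact_root(100, 2)` finds the integer root 10, while `exact_root(99, 2)` finds none, so the square root of 99 is not rational.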


We now consider a similar question for the values of the function $\log_{10}(x)$. Consider the list of numbers

\[ \log_{10}(1), \log_{10}(2), \log_{10}(3), \log_{10}(4), \ldots, \log_{10}(99), \log_{10}(100), \log_{10}(101), \ldots \]

Some of the numbers like $\log_{10}(1), \log_{10}(10), \log_{10}(100), \ldots$ are obviously integers. We will show that all the other numbers on the list must be irrational.


Corollary 3.13. If $a$ is a positive integer, then $\log_{10}(a)$ is either an integer or is irrational.


Proof. Suppose $\log_{10}(a)$ is rational; it will suffice to show that it must be an integer. Since $\log_{10}(1) = 0 \in \mathbb{Z}$, for the remainder of the proof, we may assume that $a > 1$. Observe that if $\log_{10}(a)$ is rational, then we can write it as a fraction,

\[ \log_{10}(a) = \frac{c}{d}, \]

where $c, d$ are positive integers. We will show that $\frac{c}{d}$ is an integer, and it is worth pointing out that we will not need to assume that $\frac{c}{d}$ is in lowest terms. The preceding equation implies that

\[ 10^{c/d} = a, \]

and raising both sides to the $d$th power results in

\[ 10^c = a^d. \]

Using the fact that $10 = 2 \cdot 5$, the preceding equation becomes

\[ 2^c \cdot 5^c = a^d. \]

As a result, if $p$ is a prime that divides $a$, then $p$ must divide $2^c \cdot 5^c$, and Corollary 3.10 now implies that $p = 2$ or $p = 5$. On the other hand, since 2 and 5 divide $2^c \cdot 5^c$, they must both




divide $a^d$, and Lemma 3.11 now implies that they both divide $a$. As a result, the prime factorization of $a$ must be $a = 2^s \cdot 5^t$, where $s, t$ are positive integers. Substituting this into the preceding equation, we obtain

\[ 2^c \cdot 5^c = (2^s \cdot 5^t)^d = 2^{sd} \cdot 5^{td}. \]

The equation gives two factorizations of the same number, so the uniqueness of prime factorization implies that the exponents of 2 and 5 in both factorizations must be the same. Therefore,

\[ c = s \cdot d \quad\text{and}\quad c = t \cdot d. \]

These equations tell us that

\[ \frac{c}{d} = s = t. \]

Therefore, $\frac{c}{d}$ is indeed an integer.
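The proof shows that a rational value of $\log_{10}(a)$ forces $a = 2^s \cdot 5^t$ with $s = t$, that is, $a$ must be a power of 10. A minimal sketch of that test follows; the function name `log10_if_integer` is our own and is used only for illustration.

```python
def log10_if_integer(a):
    """Return log base 10 of the positive integer a when a is a power of 10, else None.

    By Corollary 3.13, a result of None means that log10(a) is not rational.
    """
    exponent = 0
    while a % 10 == 0:   # remove one factor of 2 and one factor of 5 together
        a //= 10
        exponent += 1
    return exponent if a == 1 else None
```

Here `log10_if_integer(1000)` returns 3, while `log10_if_integer(53)` and `log10_if_integer(20)` both return `None`.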


In light of Corollaries 3.12 and 3.13, we make a remark similar to one in the final paragraph of Section 2.2. These corollaries tell us that $\sqrt{2}$, $\sqrt{6}$, $\log_{10}(3)$, and $\log_{10}(53)$ are not rational numbers. To assert that they are irrational, we would technically first need to prove that they are real numbers. As remarked earlier, this issue will be dealt with in Chapter 4.


The properties of Z that we have discussed in this chapter will be used throughout this book to 
solve a variety of problems. The four basic concepts about Z that you should always keep in 
mind are prime numbers, division algorithm, Euclidean Algorithm, and the existence and 
uniqueness of prime factorization. Later, when you study sets of polynomials such as Q[x], 
R[x], and C[x], be sure to notice the similarities among their algebraic structures and that of Z. 
In Chapter 12, it will also be interesting to examine the surprising fact that the set Z[x] of 
polynomials with integer coefficients actually has less in common with Z than do the sets 


Q[x], R[x], and C[x]. 


Exercises for Sections 3.4 and 3.5 
1. Show that any two consecutive positive integers must be relatively prime. 
2. (a) Show that any two consecutive positive odd integers must be relatively prime. 


(b) Show that the greatest common divisor of any two consecutive positive even 
integers is 2. 


3. If $a, b, n \in \mathbb{N}$, show that $\gcd(n \cdot a, n \cdot b) = n \cdot \gcd(a, b)$.


4. If $a, b \in \mathbb{N}$ and $c = \gcd(a, b)$, show that $\frac{a}{c}$ and $\frac{b}{c}$ are relatively prime.

5. If $a, b$ are distinct positive odd integers and are relatively prime, find $\gcd(a + b, a - b)$.

6. If $a, b$ are distinct positive even integers and $\gcd(a, b) = 2$, show that either $\gcd(a + b, a - b) = 2$ or $\gcd(a + b, a - b) = 4$. Give examples of $a$ and $b$ that show that both possibilities can occur.

7. Show that if $a, b, m, n \in \mathbb{N}$ such that $a$ and $b$ are relatively prime, then $a^m$ and $b^n$ are also relatively prime.

8. If $a, b, c, n \in \mathbb{N}$ and $c = \gcd(a, b)$, show that $c^n = \gcd(a^n, b^n)$.

9. Suppose $a, b, c \in \mathbb{N}$ and let $d = \gcd(a, b)$.
(a) Show that $\gcd(a, b, c) = \gcd(c, d)$.
(b) Show that $\gcd(a, b, c)$ is the smallest positive integer that can be written in the form $A \cdot a + B \cdot b + C \cdot c$, where $A, B, C \in \mathbb{Z}$.

10. This exercise will generalize exercise 9. Suppose $n, a_1, a_2, \ldots, a_n \in \mathbb{N}$, and let $c = \gcd(a_1, a_2, \ldots, a_{n-1})$.
(a) Show that $\gcd(a_1, a_2, \ldots, a_n) = \gcd(a_n, c)$.
(b) Show that $\gcd(a_1, a_2, \ldots, a_n)$ is the smallest positive integer that can be written in the form $A_1 \cdot a_1 + A_2 \cdot a_2 + \cdots + A_n \cdot a_n$, where every $A_i \in \mathbb{Z}$.

11. (a) Find integers $A, B$ such that $\gcd(1477, 770) = 1477A + 770B$.
(b) Use your answer from part (a) to find integers $C$ and $D$ such that $42 = 1477C + 770D$.
(c) Do there exist integers $E$ and $F$ such that $45 = 1477E + 770F$? Briefly explain.
(d) Describe all integers $M$ such that there exist integers $G$ and $H$ such that $M = 1477G + 770H$.

12. (a) Find integers $A, B$ such that $\gcd(207, 348) = 207A + 348B$.
(b) Find integers $C$ and $D$ such that $33 = 207C + 348D$.
(c) Do there exist integers $E$ and $F$ such that $20 = 207E + 348F$? Briefly explain.
(d) Describe all integers $M$ such that there exist integers $G$ and $H$ such that $M = 207G + 348H$.

13. (a) Find integers $A, B$ such that $\gcd(4411, 2486) = 4411A + 2486B$.
(b) Find integers $C$ and $D$ such that $55 = 4411C + 2486D$.


(c) Do there exist integers $E$ and $F$ such that $23 = 4411E + 2486F$? Briefly explain.
(d) Describe all integers $M$ such that there exist integers $G$ and $H$ such that $M = 4411G + 2486H$.

14. (a) Find integers $A, B$ such that $\gcd(2665, 1976) = 2665A + 1976B$.
(b) Find integers $C$ and $D$ such that $65 = 2665C + 1976D$.
(c) Do there exist integers $E$ and $F$ such that $30 = 2665E + 1976F$? Briefly explain.
(d) Describe all integers $M$ such that there exist integers $G$ and $H$ such that $M = 2665G + 1976H$.


Find integers A, B such that 4 = 4 + 


lb 


Find integers A, B such that an = 4 + #. 
17. (a) Find integers $A, B, C$ such that $\gcd(156, 123, 114) = 156A + 123B + 114C$.
(b) Find integers $D, E, F$ such that $30 = 156D + 123E + 114F$.
(c) Do there exist integers $G, H, I$ such that $40 = 156G + 123H + 114I$? Briefly explain.
(d) Describe all integers $J$ such that there exist integers $K, L, M$ such that $J = 156K + 123L + 114M$.

18. (a) Find integers $A, B, C$ such that $\gcd(952, 700, 546) = 952A + 700B + 546C$.
(b) Find integers $D, E, F$ such that $42 = 952D + 700E + 546F$.
(c) Do there exist integers $G, H, I$ such that $32 = 952G + 700H + 546I$? Briefly explain.
(d) Describe all integers $J$ such that there exist integers $K, L, M$ such that $J = 952K + 700L + 546M$.


Find integers A, B, C such that sa0 = 4 + 8 + £. 


oa: tt 2 A cB i iC 
Find integers A, B, C such that m0 =~ ptata- 


21. You are given a scale that works in the following manner: You can place weights on the left and right sides of the scale. The side of the scale with the greater total weight will tilt downward. If both sides of the scale have the same weight, there will be no movement. Suppose you are given a large collection of 25-ounce weights and 15-ounce weights.
(a) Explain two different ways that you could determine if a pebble weighed 5 ounces.


(b) Explain why you could not determine if a pebble weighed 1 ounce.
(c) If you were additionally given a single 2-ounce weight, could you determine if a pebble weighed 1 ounce? In this case, could you determine if a pebble weighed 3 ounces? Explain your answers.
(d) Suppose instead of being given one 2-ounce weight, you were given two 2-ounce weights. Could you now determine if a pebble weighed 1 ounce? Explain.

22. Prove the uniqueness portion of Theorem 3.3 with a proof that does not use Mathematical Induction but does use the Well Ordering Principle in a proof by contradiction.


If n = 27.34.77. 133 and m = 29-3°-5-114-13!, find gcd(n, m). 
If a= 34*.5!.119-177-19°, b= 27.33.52. 113-177-238, and c = 2! . 59.73. 114.233, 
find gcd(a, b, c). 


25. If $a, b \in \mathbb{N}$, let the least common multiple of $a$ and $b$, denoted $\mathrm{lcm}(a, b)$, be the smallest positive integer that is a multiple of both $a$ and $b$. Prove that $\mathrm{lcm}(a, b) = \frac{a \cdot b}{\gcd(a, b)}$.

26. If $n, m$ are as in exercise 23, find $\mathrm{lcm}(n, m)$.


If n is a positive integer and A, B, C, D, E € Z such that n = DABS 5B. 7? SIT e and 
n=2°.3).58.7£.11?, find A, B,C, D, E. 


If n is a positive integer and A, B,C, D, E € Z such that n = 34.11°.17%.23° and 
n=2?.38.11".23°, find A, B, C, D, E. 


If m is a positive integer and x, y € Z such that m = 5**+”. 294 and m = 5!! .29°*-”, find 
x and y. 


If m is a positive integer and x, y € Z such that m = 7° - 134**! and m = 7*+39-13!?, 
find x and y. 


If a, b,c, d are integers such that = = 24.3°.5¢.74% find a, b,c, d. 
If a, b, c, d are integers such that & = 24.3°.5¢.74 find a, b,c, d. 


33. Let $S = \{p_1, p_2, \ldots, p_n, q_1, q_2, \ldots, q_m\}$ be a set consisting of $n + m$ different prime numbers. Prove that the number $(p_1 \cdot p_2 \cdots p_n) + (q_1 \cdot q_2 \cdots q_m)$ is not divisible by any of the elements of $S$.

34. Use exercise 33 to come up with another proof that there is an infinite number of primes.


35. One of the main ideas of this chapter was the unique factorization of positive integers into primes. In this exercise, we will see that if we allow negative exponents, then we can uniquely factor into primes those positive rational numbers that are in lowest terms. Observe this was the idea you applied in exercises 31 and 32. Now suppose $p_1 < p_2 < \cdots < p_t$ are prime numbers such that

\[ p_1^{a_1} \cdot p_2^{a_2} \cdots p_t^{a_t} = p_1^{b_1} \cdot p_2^{b_2} \cdots p_t^{b_t}, \]

where every $a_i, b_i$ is an integer. Show that $a_i = b_i$, for every $i$.

36. (a) For each of the following integers, determine how many positive integers are divisors of the given number: 8, 9, 10, 15, 16, 17, 24, 25, 26.
(b) In which of these cases is the number of divisors odd? Do you detect any pattern?

37. (a) For each of the following integers, determine how many positive integers are divisors of the given number: 27, 28, 29, 35, 36, 49, 50, 51, 64.
(b) In which of these cases is the number of divisors odd? Do you detect any pattern?

38. (a) Suppose $n > 1$ is a positive integer and $n = p_1^{t_1} \cdot p_2^{t_2} \cdots p_m^{t_m}$ is the prime factorization of $n$ (you may assume that $p_1 < p_2 < \cdots < p_m$). Show that the number of positive integers which are divisors of $n$ is $(t_1 + 1) \cdot (t_2 + 1) \cdots (t_m + 1)$.
(b) Look at the prime factorizations of the numbers in part (a) of either exercise 36 or 37. Then check that in all cases the formula from part (a) of this problem yields the same number of positive divisors.
(c) Use part (a) of this problem to show that $n \in \mathbb{N}$ will have an odd number of positive divisors if and only if $n$ is a perfect square. Then examine your answer to part (b) of either exercise 36 or 37.


Without using Corollary 3.12, prove that 73 is not a rational number. 


Without using Corollary 3.12, prove that (210)5 is not a rational number. 


(a) 
(b) 
(c) 


Show that log,(16) is not a rational number. 

Find all positive integers m such that log; (m) is rational. 
Find all positive integers m such that log,,(m) is rational. 
Show that log3¢(40) is not a rational number. 

Find all positive integers m such that log3¢(m) is rational. 


Find all positive integers m such that log, (m) is rational. 


43. Show that there cannot exist rational numbers $a, b$ such that $\sqrt{3} = a + b\sqrt{2}$.


Show that there cannot exist rational numbers a, b such that 15 =a+ bV6. 


45. Generalize the results in exercises 43-44 and show that if $n, m \in \mathbb{N}$ such that none of $\sqrt{n}, \sqrt{m}, \sqrt{nm}$ are rational, then there cannot exist rational numbers $a, b$ such that

\[ \sqrt{n} = a + b\sqrt{m}. \]
46. (a) If $n, k \in \mathbb{N}$ such that $k \le n$, show that

\[ \frac{n!}{(k-1)!\,(n-k+1)!} + \frac{n!}{k!\,(n-k)!} = \frac{(n+1)!}{k!\,(n+1-k)!}. \]


(b) Use part (a) and the Well Ordering Principle to show that if $n \in \mathbb{N}$ and if $k$ is an integer with $0 \le k \le n$, then $\frac{n!}{k!\,(n-k)!}$ is an integer.

(c) Use part (b) to conclude that if $m \in \mathbb{N}$, then the product of any $m$ consecutive positive integers is divisible by $m!$.


47. If we let $\mathbb{Q}^+$ denote the set of positive rational numbers, define the function $f : \mathbb{Q}^+ \to \mathbb{N}$ as $f\left(\frac{p}{q}\right) = 2^p \cdot 3^q$, where the rational number $\frac{p}{q}$ is in lowest terms and $p, q \in \mathbb{N}$. Compute
f), FG), and FQ). 
48. Prove that the function $f : \mathbb{Q}^+ \to \mathbb{N}$ in exercise 47 is an injection.
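49. If we let $\mathbb{Z}[x]$ denote the set of polynomials with integer coefficients, define the function $g : \mathbb{Z}[x] \to \mathbb{Q}^+$ as

\[ g(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n) = 2^{a_0} \cdot 3^{a_1} \cdot 5^{a_2} \cdots p_{n+1}^{a_n}, \]

where each $a_i \in \mathbb{Z}$ and $p_t$ represents the $t$th smallest prime number, for all $t \in \mathbb{N}$.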
Compute g(34+ 2x), g(—5+4x7), and g(3 — 4x — 6x” + 5x4). 


50. Prove that the function $g : \mathbb{Z}[x] \to \mathbb{Q}^+$ in exercise 49 is an injection.

51. Use exercises 48 and 50 to show that there exists a function $h : \mathbb{Z}[x] \to \mathbb{N}$ which is an injection.


The Rational Numbers and the 
Real Numbers 


Much of this course is focused on the solutions of various types of equations. Let us look at 
two equations involving polynomials with integer coefficients 


\[ 2x - 1 = 0 \quad\text{and}\quad x^2 - 2 = 0. \]


In the previous chapter, we examined some of the interesting properties of the integers. 
However, the integers are not a big enough number system to contain the solutions of the 
preceding equations. 


The solution to $2x - 1 = 0$ is $x = \frac{1}{2}$, and $\frac{1}{2}$ is a rational number that is not an integer. Therefore, our investigation of the roots of polynomials will lead us beyond the integers to the set of rational numbers $\mathbb{Q}$. Similarly, the solutions of $x^2 - 2 = 0$ are $x = \pm\sqrt{2}$, and $\pm\sqrt{2}$ are real numbers that are not rational. Thus, additional investigations will take us beyond the rational numbers to the real numbers $\mathbb{R}$. Furthermore, to find the roots of the polynomial $x^2 + 1$, we will need to move beyond the real numbers to the complex numbers $\mathbb{C}$. In this chapter, we will examine the rational numbers and the real numbers. The complex numbers will be examined in Chapters 5 and 6.


4.1 Rational Numbers 


There are two approaches we could take when studying the rational and real number systems. 
The first approach is based on the assumption that our previous courses in calculus have 
given us an intuitive understanding of the real numbers. This allows us to view the rational 
numbers as those real numbers that can be written as a quotient of integers. We then show 
that this corresponds to those real numbers whose decimal expansion eventually repeats. 


Later in this chapter, we present the second and more theoretical approach. In this approach, 
we view the rational numbers as essentially being pairs of integers. We then construct the real 
numbers by looking at certain sequences of rational numbers. Both approaches will present us 
with some subtle and important issues that we will need to deal with. Our motivation for 
studying the rational and real numbers is to understand and find the solutions to various 




equations. If you are interested in a more complete and detailed look at the construction of the 
real numbers, it can be found in a book on real analysis. We begin the first approach with 


Proposition 4.1. A real number $\alpha$ can be written as a quotient of integers if and only if its decimal expansion eventually repeats. Real numbers of this form are called rational numbers, and we denote the set of rational numbers as $\mathbb{Q}$.


Intuition. Let's consider the decimal expansion of $\frac{23}{7}$. It is obtained by long division and the expansion begins as

\[ \frac{23}{7} = 3.285714\cdots \]

(You should get a paper and pencil and perform the long division.) When we do the long division, the first seven remainders we obtain, in order, are 2, 6, 4, 5, 1, 3, and 2. Note that the first time we obtained a remainder of 2, it was followed by the remainders 6, 4, 5, 1, 3, and 2. Therefore, every time we have a remainder of 2, it will be followed by remainders of 6, 4, 5, 1, 3, and 2. As a result, the sequence of remainders 2, 6, 4, 5, 1, and 3 will continue to repeat forever. However, these remainders cause the term 285714 to appear in the decimal expansion of $\frac{23}{7}$. Therefore, the term 285714 repeats forever in the decimal expansion of $\frac{23}{7}$, and we have

\[ \frac{23}{7} = 3.285714285714285714285714285714285714285714\cdots \]


Remember, the division algorithm told us that when dividing by 7, the only possible 
remainders are 0, 1,2, 3,4,5, and 6. Therefore, when performing long division by 7 to find 
the decimal expansion of a fraction whose denominator is 7, one of the remainders must 
eventually occur a second time. As we just saw, that will cause the terms in the decimal 
expansion to repeat forever. Certainly there is nothing special about the number 7. No matter 
what the denominator of a fraction is, the division algorithm allows only a finite number of 
possible remainders. Therefore, one of the remainders eventually appears a second time, and 
this causes the terms in the decimal expansion to repeat forever. 
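The long-division argument above translates directly into a short program: carry out the division one digit at a time and stop as soon as a remainder appears for the second time. This is only an illustrative sketch; the function name `decimal_expansion` and its return format are our own assumptions.

```python
def decimal_expansion(a, b):
    """Long division of a by b, for positive integers a and b.

    Returns three strings: the integer part, the digits after the decimal
    point that come before the repeating block, and the repeating block.
    """
    integer_part, remainder = divmod(a, b)
    digits = []
    seen = {}  # remainder -> position of the digit that this remainder produces
    while remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, b)
        digits.append(str(digit))
    start = seen[remainder]  # the first repeated remainder marks the start of the block
    return str(integer_part), "".join(digits[:start]), "".join(digits[start:])
```

For $\frac{23}{7} = 3.285714\cdots$ the function returns `("3", "", "285714")`. A terminating decimal is reported with the repeating block `"0"`, in line with viewing it as an expansion in which 0 repeats forever. Since there are only $b$ possible remainders, the loop must stop after at most $b$ steps.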


For a second example, consider the decimal expansion of $\frac{73}{13}$. The decimal expansion begins as

\[ \frac{73}{13} = 5.615384\cdots \]

Using long division by 13, the first seven remainders obtained, in order, are 8, 2, 7, 5, 11, 6, and 8. Therefore, the sequence of remainders 8, 2, 7, 5, 11, and 6 will repeat forever. This results in the term 615384 repeating forever in the decimal expansion of $\frac{73}{13}$. Thus, we have

\[ \frac{73}{13} = 5.615384615384615384615384615384615384615384\cdots \]




It is also worth remembering that if a remainder of 0 ever occurs, then the decimal expansion 
terminates. However, in this case, we can still consider this as a decimal expansion that 
eventually repeats, since 0 repeats forever. For example, we have 


\[ \frac{59}{8} = 7.375 = 7.3750000000000\cdots \]


We will now examine an algorithm you were probably first introduced to in elementary or high school. It allows us to rewrite any real number whose decimal expansion eventually repeats as a quotient of integers. Suppose we are given a real number whose decimal expansion eventually repeats, such as

\[ \alpha = 54.64915915915915915915915915\cdots \]

The term 915 repeats forever, and we can multiply $\alpha$ by an appropriate power of 10 so the first appearance of the term 915 is immediately to the right of the decimal point. In particular, we have

\[ 10^2 \cdot \alpha = 5464.915915915915915915915915\cdots \tag{1} \]

We can also multiply $\alpha$ by a different power of 10 so now the second appearance of the term 915 is immediately to the right of the decimal point. In particular, we also have

\[ 10^5 \cdot \alpha = 5464915.915915915915915915915915\cdots \tag{2} \]

We have arranged things so all the terms to the right of the decimal point in both (1) and (2) are identical. Therefore, if we subtract equation (1) from equation (2), we obtain

\[ 10^5 \cdot \alpha - 10^2 \cdot \alpha = 5464915 - 5464, \]

which tells us that

\[ \alpha = \frac{5459451}{99900}. \]


There are some subtle issues related to our assertion that all the terms to the right of the decimal point cancel out when we subtract equation (1) from equation (2). We will deal with these issues after the proof of Proposition 4.1. However, even if you are unaware of these subtleties, you can always use this algorithm to rewrite a real number with a decimal expansion that eventually repeats as a quotient of integers.
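The subtraction trick just described can be written as a small routine. The sketch below assumes the decimal is supplied as three digit strings (the integer part, the digits before the repeating block, and the repeating block itself); that input format and the name `repeating_to_fraction` are our own choices, and Python's `fractions.Fraction` is used to keep the arithmetic exact.

```python
from fractions import Fraction


def repeating_to_fraction(integer_part, non_repeating, repeating):
    """Convert integer_part.non_repeating(repeating)(repeating)... to a Fraction.

    All three arguments are strings of digits, and `repeating` must be non-empty.
    """
    k = len(non_repeating)  # multiplying by 10**k puts the first block right after the point
    n = len(repeating)      # multiplying by 10**(k + n) puts the second block there
    shifted_small = int(integer_part + non_repeating)
    shifted_large = int(integer_part + non_repeating + repeating)
    # Subtracting the two shifted equations cancels the identical decimal tails.
    return Fraction(shifted_large - shifted_small, 10 ** (k + n) - 10 ** k)
```

With the number 54.64915915915... from the text, `repeating_to_fraction("54", "64", "915")` computes $(5464915 - 5464)/(10^5 - 10^2)$, which `Fraction` then stores in lowest terms; the result is equal to $\frac{5459451}{99900}$.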


Before proving Proposition 4.1, we need to briefly address the question “What is a real 
number?” In other words, what does the expression “decimal expansion” really mean? 
To help answer this, let us consider 


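\[ \beta = 1021.3437052096436887541987\cdots \]

The number $\beta$ refers to the sum or infinite series

\[ 1021 + \frac{3}{10^1} + \frac{4}{10^2} + \frac{3}{10^3} + \frac{7}{10^4} + \frac{0}{10^5} + \frac{5}{10^6} + \frac{2}{10^7} + \frac{0}{10^8} + \frac{9}{10^9} + \frac{6}{10^{10}} + \frac{4}{10^{11}} \]
\[ {} + \frac{3}{10^{12}} + \frac{6}{10^{13}} + \frac{8}{10^{14}} + \frac{8}{10^{15}} + \frac{7}{10^{16}} + \frac{5}{10^{17}} + \frac{4}{10^{18}} + \frac{1}{10^{19}} + \frac{9}{10^{20}} + \frac{8}{10^{21}} + \frac{7}{10^{22}} + \cdots \]

Therefore, to truly understand the nature of the number $\beta$, we need to understand infinite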
series and convergence. Real numbers can be viewed as convergent infinite series of rational 
numbers. On the one hand, we can view the integers and rational numbers as purely algebraic 
objects, since they are defined in terms of addition, subtraction, multiplication, and division. 
On the other hand, the real numbers are defined in terms of limits of convergent series. Thus, 
the real numbers are not purely an algebraic object. Therefore, it is not surprising that the 
proof of Proposition 4.1 will require a basic fact from calculus on the convergence of certain 
series. We will state this familiar fact without proof. 


Lemma 4.2. Let $a$ and $r$ be real numbers such that $|r| < 1$. Then

\[ a + ar + ar^2 + ar^3 + ar^4 + \cdots + ar^n + ar^{n+1} + \cdots = \frac{a}{1 - r}. \]

Examples
" a ae ait 
rat ant 53 tat ta 
\[ \frac{3}{10} + \frac{3}{10^2} + \frac{3}{10^3} + \frac{3}{10^4} + \frac{3}{10^5} + \cdots = \frac{\frac{3}{10}}{1 - \frac{1}{10}} = \frac{1}{3} \]

The types of series referred to in Lemma 4.2 are known as geometric series. When viewing 
real numbers as infinite series, an important observation to make is that those real numbers 
whose decimal expansions eventually repeat are precisely those where the corresponding 
infinite series is a sum of a finite number of terms followed by a geometric series. This will 
become clearer during the proof of Proposition 4.1. 


Lemma 4.3. Let $n$ be a positive integer and let $d_1, d_2, \ldots, d_n$ be elements of the set $\{0, 1, 2, 3, \ldots, 8, 9\}$. Then the repeating decimal

\[ 0.d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n \cdots \]

is equal to the fraction

\[ \frac{d_1 d_2 \cdots d_n}{10^n - 1}. \]




Intuition. It is important to not be confused by the notation. The term $d_1 d_2 \cdots d_n$ refers to a listing of $n$ digits in a row. It represents an integer that is greater than or equal to zero and is less than $10^n$. It does not represent the product of the $d_i$'s. In particular, the lemma asserts that

\[ 0.1492149214921492149214921492\cdots = \frac{1492}{10^4 - 1} = \frac{1492}{9999}, \]
\[ 0.072072072072072072\cdots = \frac{072}{10^3 - 1} = \frac{72}{999} = \frac{8}{111}. \]


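Proof. We can interpret the repeating decimal

\[ 0.d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n \cdots \]

as representing the geometric series

\[ \frac{d_1 d_2 \cdots d_n}{10^n} + \frac{d_1 d_2 \cdots d_n}{10^{2n}} + \frac{d_1 d_2 \cdots d_n}{10^{3n}} + \frac{d_1 d_2 \cdots d_n}{10^{4n}} + \frac{d_1 d_2 \cdots d_n}{10^{5n}} + \cdots \]

Applying Lemma 4.2, with $a = \frac{d_1 d_2 \cdots d_n}{10^n}$ and $r = \frac{1}{10^n}$, we see that the sum of this series is

\[ \frac{\frac{d_1 d_2 \cdots d_n}{10^n}}{1 - \frac{1}{10^n}} = \frac{d_1 d_2 \cdots d_n}{10^n - 1}. \]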


We can now prove Proposition 4.1. Not surprisingly, the proof will use the division algorithm. 
More precisely, we need the fact that when we divide by an integer b, there are only a finite 
number of possible remainders. 


Proof of Proposition 4.1. It is easy to see that $\alpha$ can be written as a quotient of integers if and only if $-\alpha$ can. Similarly, the decimal expansion of $\alpha$ eventually repeats if and only if the same is true for $-\alpha$. Furthermore, the result is certainly true when $\alpha = 0$. Therefore, to prove the proposition, it is enough to consider the case where $\alpha$ is positive. In one direction, let us suppose that the decimal expansion of $\alpha$ eventually repeats. Therefore, there is a block of digits of the form $d_1 d_2 \cdots d_n$ that, at some point to the right of the decimal point, eventually repeats. Thus, there exist digits $a_1, a_2, \ldots, a_m$ such that

\[ \alpha = a_1 a_2 \cdots a_s . a_{s+1} \cdots a_m d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n \cdots \]

Note that the decimal point occurs between $a_s$ and $a_{s+1}$, which is to the left of the block $d_1 d_2 \cdots d_n$. We can multiply $\alpha$ by an appropriate power of 10 to move the decimal point to the right and obtain

\[ 10^{m-s} \cdot \alpha = a_1 a_2 \cdots a_s a_{s+1} \cdots a_m . d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n \cdots \]




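Therefore,

\[ 10^{m-s} \cdot \alpha = a_1 a_2 \cdots a_m + 0.d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n \cdots \]

When we apply Lemma 4.3 to the repeating decimal

\[ 0.d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n d_1 d_2 \cdots d_n \cdots \]

the previous equation becomes

\[ 10^{m-s} \cdot \alpha = a_1 a_2 \cdots a_m + \frac{d_1 d_2 \cdots d_n}{10^n - 1}. \]

Dividing both sides by $10^{m-s}$ yields

\[ \alpha = \frac{a_1 a_2 \cdots a_m}{10^{m-s}} + \frac{d_1 d_2 \cdots d_n}{10^{m-s} \cdot (10^n - 1)}. \]

Adding the two fractions on the right-hand side of the previous equation, we obtain

\[ \alpha = \frac{(10^n - 1) \cdot (a_1 a_2 \cdots a_m) + d_1 d_2 \cdots d_n}{10^{m-s} \cdot (10^n - 1)}. \]

Therefore, we have succeeded in writing $\alpha$ as a quotient of integers.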


On the other hand, suppose $\alpha$ can be written as $\frac{a}{b}$, where $a, b \in \mathbb{N}$. Let us consider the remainders that are obtained when we divide the numbers $10, 10^2, 10^3, 10^4, \ldots$ by $b$. Since all of the remainders are nonnegative and less than $b$, eventually we can find positive integers $n, m$ with $n > m$ such that $10^n$ and $10^m$ have the same remainder when divided by $b$. The division algorithm asserts that there exist integers $q_1, q_2, r$ with $0 \le r < b$ such that

\[ 10^n = q_1 \cdot b + r \quad\text{and}\quad 10^m = q_2 \cdot b + r. \]

Thus,

\[ 10^n - 10^m = (q_1 \cdot b + r) - (q_2 \cdot b + r) = (q_1 - q_2) \cdot b, \]

which is certainly a multiple of $b$. If we multiply both sides of the preceding equation by $a$ and divide by $b \cdot (10^n - 10^m)$, we obtain

\[ \alpha = \frac{a}{b} = \frac{(q_1 - q_2) \cdot a}{10^n - 10^m}. \]

We can certainly rewrite the preceding equation as

\[ \alpha = \frac{1}{10^m} \cdot \frac{(q_1 - q_2) \cdot a}{10^{n-m} - 1}. \]




Next, we apply the division algorithm and divide $(q_1 - q_2) \cdot a$ by $10^{n-m} - 1$ to obtain

\[ (q_1 - q_2) \cdot a = c \cdot (10^{n-m} - 1) + d, \]

where $0 \le d < 10^{n-m} - 1$. Substituting the result of this equation into the previous one, we see that

\[ \alpha = \frac{1}{10^m} \cdot \left( c + \frac{d}{10^{n-m} - 1} \right). \]

Since $0 \le d < 10^{n-m} - 1$, there exist digits $d_1, d_2, \ldots, d_{n-m}$ such that $d = d_1 d_2 \cdots d_{n-m}$. In addition, we can also write $c$ as a block of digits $a_1 a_2 \cdots a_t$ to obtain

\[ \alpha = \frac{1}{10^m} \cdot \left( a_1 a_2 \cdots a_t + \frac{d_1 d_2 \cdots d_{n-m}}{10^{n-m} - 1} \right). \]

Note that we have some control over the number of digits that appear in $c$ as we could attach as many zeroes on the left as we wish. This means that if necessary, we could extend the length of $c$. In the final step of the proof, we will need $t - m$ to be at least 0. Therefore, let us now extend the length of $c$ and make $t$ large enough that $t - m \ge 0$.

Applying Lemma 4.3 to $\frac{d_1 d_2 \cdots d_{n-m}}{10^{n-m} - 1}$ and substituting that result into the previous equation gives us

\[ \alpha = \frac{1}{10^m} \cdot \left( a_1 a_2 \cdots a_t . d_1 d_2 \cdots d_{n-m} d_1 d_2 \cdots d_{n-m} d_1 d_2 \cdots d_{n-m} \cdots \right). \]

Since division by $10^m$ moves the decimal point $m$ places to the left and $t - m \ge 0$, we see that

\[ \alpha = a_1 a_2 \cdots a_{t-m} . a_{t-m+1} \cdots a_t d_1 d_2 \cdots d_{n-m} d_1 d_2 \cdots d_{n-m} d_1 d_2 \cdots d_{n-m} \cdots \]

As a result, the decimal expansion of $\alpha$ does indeed eventually repeat.
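The key step in this half of the proof is finding exponents $n > m$ for which $10^n$ and $10^m$ leave the same remainder when divided by $b$. A minimal sketch of that search follows; the function name `matching_powers` is hypothetical, and the code returns the first such pair it meets rather than all of them.

```python
def matching_powers(b):
    """Return positive integers (n, m) with n > m such that 10**n and 10**m
    leave the same remainder when divided by the positive integer b."""
    seen = {}  # remainder of 10**power -> power
    power = 1
    remainder = 10 % b
    while remainder not in seen:
        seen[remainder] = power
        power += 1
        remainder = (remainder * 10) % b
    return power, seen[remainder]
```

For $b = 7$ the search returns $n = 7$ and $m = 1$, so $10^7 - 10^1$ is a multiple of 7 and $n - m = 6$, which matches the length of the repeating block 285714 in the expansion of $\frac{23}{7}$. Because only $b$ remainders are possible, the loop ends after at most $b$ steps.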


We can now consider some of the subtle points that lie beneath the surface when we rewrite a real number with a decimal expansion that eventually repeats as a quotient of integers. To this end, let's look at two calculations which both lead to incorrect conclusions. Suppose

\[ a = 1 + 2 + 4 + 8 + 16 + \cdots. \]

If we multiply this equation by 2, we obtain

\[ 2a = 2 + 4 + 8 + 16 + \cdots. \]

The only difference between the right-hand sides of these two equations is that the top one has a 1 in it and the bottom one doesn't. Therefore, if we subtract the bottom equation from the top, we obtain

\[ -a = 1, \]




which implies that $a = -1$. However, this is clearly false.

Let's look at one more example. Suppose

\[ b = 10^1 + 10^2 + 10^3 + 10^4 + 10^5 + \cdots. \]

Multiplying this equation by 10 yields

\[ 10b = 10^2 + 10^3 + 10^4 + 10^5 + \cdots. \]

Subtracting the bottom equation from the top yields

\[ -9b = 10, \]

which implies that $b = -\frac{10}{9}$. This also is clearly false.

These two calculations are very similar to the one where we showed that the real number $54.64915915915915915915915915\cdots$ could be rewritten as the quotient $\frac{5459451}{99900}$. In all of these calculations, we canceled out two copies of a sum of an infinite number of terms. Since one of these calculations gave us the correct answer and two gave us incorrect answers, it raises the question, what went wrong in two of these calculations?


In all of these calculations, we have a sum of an infinite number of terms like

\[ \frac{915}{10^3} + \frac{915}{10^6} + \frac{915}{10^9} + \frac{915}{10^{12}} + \frac{915}{10^{15}} + \cdots \]

or

\[ 2 + 4 + 8 + 16 + 32 + \cdots \]

appearing in more than one equation. We then subtracted one equation from another and claimed that all of the identical terms in the two sums canceled out. Now comes the subtle but essential point. It is only valid to cancel out a sum of an infinite number of terms from two equations if the sum you are dealing with converges. By Lemma 4.2, the sum

\[ \frac{915}{10^3} + \frac{915}{10^6} + \frac{915}{10^9} + \frac{915}{10^{12}} + \frac{915}{10^{15}} + \cdots \]

converges and represents a real number. Therefore, it is valid to cancel out two copies of

\[ \frac{915}{10^3} + \frac{915}{10^6} + \frac{915}{10^9} + \frac{915}{10^{12}} + \frac{915}{10^{15}} + \cdots \]

when trying to write

\[ 54.64915915915915915915915915\cdots \]

as a quotient.


However, the sum

\[ 2 + 4 + 8 + 16 + 32 + \cdots \]

does not converge and does not represent a real number. Therefore,

\[ 2 + 4 + 8 + 16 + 32 + \cdots \]

cannot be used in calculations and cannot be subtracted from itself to give 0 as an answer. This explains why attempts to cancel two copies of

\[ 2 + 4 + 8 + 16 + 32 + \cdots \]

lead to the contradiction

\[ -1 = 1 + 2 + 4 + 8 + 16 + 32 + \cdots. \]


To summarize, when we use an algorithm to rewrite a real number with a decimal expansion 
that eventually repeats as a quotient, we are usually not concerned or interested in the theory 
surrounding geometric series and convergence. However, as the preceding examples indicate, 
convergence is indeed necessary if we are to perform a calculation that requires canceling 
two copies of a sum that involves an infinite number of terms. 
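The difference between the two kinds of sums shows up clearly in their partial sums. The sketch below computes them exactly with `fractions.Fraction`; the helper name `partial_sums` and the choice of five terms are our own, and a finite computation like this only suggests, and does not prove, convergence or divergence.

```python
from fractions import Fraction


def partial_sums(first_term, ratio, count):
    """Return the first `count` partial sums of the geometric series
    first_term + first_term*ratio + first_term*ratio**2 + ..."""
    sums = []
    total = Fraction(0)
    term = Fraction(first_term)
    for _ in range(count):
        total += term
        sums.append(total)
        term *= ratio
    return sums


# 915/10^3 + 915/10^6 + ... has ratio 1/1000, so Lemma 4.2 applies.
convergent = partial_sums(Fraction(915, 1000), Fraction(1, 1000), 5)
limit = Fraction(915, 999)  # a / (1 - r) from Lemma 4.2

# 2 + 4 + 8 + ... has ratio 2, so Lemma 4.2 does not apply.
divergent = partial_sums(2, 2, 5)
```

The gap between `limit` and the $k$th entry of `convergent` is exactly `limit` divided by $1000^k$, which shrinks toward 0, whereas the entries of `divergent` are 2, 6, 14, 30, 62 and keep growing without bound.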


4.2 Intermediate Value Theorem 


In Chapters 2 and 3, we showed that $\sqrt{2}$, $\sqrt{6}$, $\log_{10}(3)$, and $\log_{10}(53)$ were not rational numbers. Phrased somewhat differently, this means that the equations

\[ x^2 = 2, \quad x^2 = 6, \quad 10^x = 3, \quad 10^x = 53 \]

do not have any solutions in the rational numbers. This raises the question of whether these equations have solutions in the real numbers. If these equations do indeed have real solutions, then $\sqrt{2}$, $\sqrt{6}$, $\log_{10}(3)$, and $\log_{10}(53)$ are real numbers that are not rational, so we can say that they are irrational. The mathematical tool needed to show that the preceding equations have solutions in the real numbers is the Intermediate Value Theorem.


The Intermediate Value Theorem can be found in calculus and real analysis texts but is 
usually not found in abstract algebra books. However, it is an important tool for finding roots 
of polynomials and helps illustrate key differences between the rational numbers and the real 
numbers. The Intermediate Value Theorem will appear in Chapter 6 to help prove the 
Fundamental Theorem of Algebra and again in Chapters 9 and 17 in our work on the 
insolvability of the quintic. Since abstract algebra students should be familiar with
the Intermediate Value Theorem, we state it now and will prove it later in this section.




Theorem 4.4—The Intermediate Value Theorem. Suppose the function f(x) is continuous 
on the closed interval [a, b]. If f(a) < 0 and f(b) > 0, then there exists some real number c in 
the open interval (a, b) such that f(c) = 0. 


In calculus, we deal with functions whose domains are often the entire set of real numbers. 
We do not look at functions whose domains are merely the rational numbers. The reason for 
this can be seen in the following example. Consider the polynomial $f(x) = x^2 - 2$. Certainly
$f(1) < 0$ and $f(2) > 0$. However, there is no rational number $c$ between 1 and 2 such that
$f(c) = 0$. This is because $\sqrt{2}$ is not a rational number. This example indicates that if we
restrict the domains of our functions to the rational numbers, then important theorems like the 
Intermediate Value Theorem no longer hold. In light of this, there must be something special 
about the real numbers that does not hold for the rational numbers that enables us to prove the 
Intermediate Value Theorem. To proceed, we must first introduce some terminology. 


Definition 4.5. Let $S$ be a subset of $\mathbb{R}$. An element $c \in \mathbb{R}$ is called an upper bound for $S$ if
$c \geq s$, for all $s \in S$.


Observe that if c is an upper bound for S, then any number bigger than c is also an upper 
bound for S. Therefore, if a set S has one upper bound, then it has an infinite number of 
upper bounds. 


Examples


1. Let $S$ be the open interval $(3, 5)$; then some of the upper bounds for $S$ are
5, 6, 277, 1988.


2. Let $S$ be the closed interval $[3, 5]$; then some of the upper bounds for $S$ are
5, 6, 27, 1988.


3. LetS= {2 | n € N}; then some of the upper bounds for S are 3, xr, 100, 1025. 

4. let S= {3 Ine N}; then S has no upper bounds. 

5. Let $S = \{x \in \mathbb{R} \mid x^2 < 2\}$; then some of the upper bounds for $S$ are $\sqrt{2}$, 2, 3, 2001.

6. Let $S = \{x \in \mathbb{Q} \mid x^2 < 2\}$; then some of the upper bounds for $S$ are $\sqrt{2}$, 2, 3, 2001.


7. Let $S = \{x \in \mathbb{R} \mid x^2 > 2\}$; then $S$ has no upper bounds.


In our seven examples, five of the sets have upper bounds. Certainly, none of these sets has a 
largest upper bound, for if c is an upper bound for a set, then c+ 1 is a larger upper bound. 
However, observe that each of our five sets that has an upper bound has a smallest one. In 




examples (1) and (2), the smallest upper bound is 5; in example (3), the smallest upper bound 
is 3; and in examples (5) and (6), the smallest upper bound is $\sqrt{2}$. This is no accident, and it is
precisely this special property of the real numbers that enables us to prove results like the 
Intermediate Value Theorem. Before stating this property, we need one more definition. 


Definition 4.6. Let $S$ be a subset of $\mathbb{R}$. An element $c \in \mathbb{R}$ is called a least upper bound for $S$ if
(a) $c$ is an upper bound for $S$, and
(b) if $d$ is any upper bound for $S$, then $c \leq d$.


Note that the least upper bound of a set $S$ need not be an element of the set $S$. Observe that 5 is
the least upper bound of $(3, 5)$ and $5 \notin (3, 5)$. Similarly, $\sqrt{2}$ is the least upper bound of both
$\{x \in \mathbb{R} \mid x^2 < 2\}$ and $\{x \in \mathbb{Q} \mid x^2 < 2\}$, yet $\sqrt{2}$ does not belong to either set. We can now state
the key property of the real numbers that makes the real numbers significantly different from 
the rational numbers. 


Least Upper Bound Property of the Real Numbers. Let S be a nonempty subset of R. If S
has an upper bound, then there is a real number c such that c is the least upper bound of S. 


Before proceeding, there are some observations we should make about the Least Upper 
Bound Property. 


(a) If $c_1$ and $c_2$ are both least upper bounds for $S$, then they are both upper bounds for $S$,
and they are both less than or equal to any other upper bound for $S$. This tells us that
$c_1 \leq c_2$ and $c_2 \leq c_1$, so $c_1 = c_2$. As a result, a set can have at most one least upper
bound. For that reason, we refer to the least upper bound as opposed to a least upper
bound.


(b) If the set S is empty, then, technically, every real number is an upper bound for the set.
Therefore, the set would not have a smallest upper bound. For this technical reason, 
when discussing least upper bounds, we only deal with sets that are nonempty. 


(c) If $S$ is a nonempty subset of $\mathbb{Q}$ that has an upper bound, then $S$ is certainly a
nonempty subset of $\mathbb{R}$ that has an upper bound. Therefore, $S$ must have a least upper
bound, although the least upper bound need not be rational. For example, if
$S = \{x \in \mathbb{Q} \mid x^2 < 3\}$, then the least upper bound of $S$ is $\sqrt{3}$, which is not rational. In
light of this, we can say that Q does not satisfy the Least Upper Bound Property. By 
this we mean that although nonempty subsets of Q that have an upper bound must have a 
least upper bound, the least upper bound does not necessarily belong to Q. As our 
example indicates, the least upper bound must belong to R but may not belong to Q. 
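
One way to see observation (c) in action is to start from any rational upper bound of $\{x \in \mathbb{Q} \mid x^2 < 3\}$ and produce a strictly smaller one, so that no rational number can be the least upper bound. The sketch below is an illustration under an assumption not taken from the text: it uses the averaging step $u \mapsto \frac{1}{2}\left(u + \frac{3}{u}\right)$, which keeps $u$ rational, keeps $u^2 > 3$, and makes $u$ smaller. The function name is hypothetical.

```python
from fractions import Fraction

def smaller_upper_bound(u):
    """Given a rational u > 0 with u*u > 3, return a smaller rational with u*u > 3.

    Any positive rational whose square exceeds 3 is an upper bound for
    {x in Q : x^2 < 3}, so the result is a smaller rational upper bound.
    """
    return (u + 3 / u) / 2

u = Fraction(2)  # 2 is a rational upper bound because 2^2 = 4 > 3
for _ in range(4):
    v = smaller_upper_bound(u)
    # Each step stays an upper bound but moves strictly downward.
    assert v < u and v * v > 3
    print(v, float(v))
    u = v
```

Because the step can always be repeated, the set has rational upper bounds but no smallest rational one, which is the failure of the Least Upper Bound Property in $\mathbb{Q}$ described above.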


Thus, the crucial difference between the real numbers and the rational numbers is that the 
Least Upper Bound Property is satisfied by R but not by Q. Whereas the Intermediate Value 




Theorem does not hold for functions defined only on the rational numbers, the Least Upper 
Bound Property will enable us to prove the Intermediate Value Theorem for functions defined 
on the real numbers. To prove the Intermediate Value Theorem, we must first review the 
definition of continuity. 


Definition 4.7. A function $f : \mathbb{R} \to \mathbb{R}$ is said to be continuous at the point $x = c$ if for
every $\varepsilon > 0$ there exists a $\delta > 0$ such that whenever $|x - c| < \delta$, we have $|f(x) - f(c)| < \varepsilon$.


Intuitively, if a function is continuous at x = c, it tells us that we can keep all the values of 
f(x) as close to f(c) as we desire as long as we only plug in numbers close to c. In particular, 
if $f(c) > 0$, then there is an open interval $I$ containing $c$ such that whenever a point in $I$ is
plugged into $f(x)$, the result is positive. Similarly, if $f(c) < 0$, then there is an open interval $J$
containing $c$ such that whenever a point in $J$ is plugged into $f(x)$, the result is negative. These
observations will be needed in the proof of the Intermediate Value Theorem. In calculus, there 
is a slight difference between the concepts of a function being continuous on the closed 
interval [a, b] and a function being continuous at every point of the interval [a, b]. The 
differences concern what happens just to the right of the point b or just to the left of the point a. 
However, for the purposes of this course, we will be interested in using the Intermediate Value 
Theorem to show that various polynomials have real roots. Since polynomials are continuous 
everywhere, such technical concerns do not arise in our applications. For this reason, when we 
use the phrase that a function is continuous on the interval [a, b], we will give that phrase the 
somewhat nonstandard meaning that the function is continuous at every point in [a, b]. 


Proof of the Intermediate Value Theorem. Let $S = \{x \in [a, b] \mid f(x) < 0\}$. The set $S$ is certainly
nonempty as $a \in S$, and, in addition, $b$ is an upper bound for $S$. Therefore, the Least Upper
Bound Property asserts that $S$ has a least upper bound, and we will denote it as $c$. Since $a \in S$,
we see that $a \leq c$, and since $b$ is an upper bound for $S$, we see that $c \leq b$. Thus, $c \in [a, b]$.
There are now three possibilities: $f(c) > 0$, $f(c) < 0$, or $f(c) = 0$. We will show that each of
the first two possibilities leads to a contradiction, so we will be able to conclude that
$f(c) = 0$.


Suppose that $f(c) > 0$; if we let $\varepsilon = \frac{f(c)}{2}$, then the continuity of $f(x)$ at $x = c$ implies that there
exists some $\delta > 0$ such that if $x$ is within $\delta$ of $c$, then $f(x)$ is within $\frac{f(c)}{2}$ of $f(c)$. Since $f(a) < 0$
and $f(c) > 0$, it follows that $c > a$. Therefore, there exists some $d_1$ with the properties that (i)
$d_1$ is within $\delta$ of $c$, (ii) $d_1 < c$, and (iii) $d_1 > a$.


If $x$ belongs to the closed interval $[d_1, c]$, then $x$ is within $\delta$ of $c$. Therefore, $f(x)$ is within $\frac{f(c)}{2}$
of $f(c)$. As a result, $f(x) > \frac{f(c)}{2} > 0$. Now suppose $e \in S$; since $c$ is an upper bound for $S$,
$e \leq c$. Since $f(e) < 0$ and $f(x) > 0$ for all $x \in [d_1, c]$, it follows that $e$ does not belong to the
interval $[d_1, c]$. Thus, $e < d_1$. This tells us that $d_1$ is both an upper bound for $S$ and is less than
$c$, which contradicts the fact that $c$ is the least upper bound of $S$. Therefore, it is impossible for
$f(c)$ to be positive.




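[Figure: graph of $f(x)$ near $x = c$ in the case $f(c) > 0$, with $a$, $d_1$, and $c$ marked on the horizontal axis and $\varepsilon$, $\delta$ indicated around $f(c)$.]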


Now suppose that $f(c) < 0$; in this case, we let $\varepsilon = -\frac{f(c)}{2}$. The continuity of $f(x)$ at $x = c$
implies that there exists some $\delta > 0$ such that if $x$ is within $\delta$ of $c$, then $\frac{3f(c)}{2} < f(x) < \frac{f(c)}{2} < 0$.
Since $f(b) > 0$ and $f(c) < 0$, it follows that $c < b$. Therefore, there exists some $d_2$ with the
properties that (i) $d_2$ is within $\delta$ of $c$, (ii) $d_2 > c$, and (iii) $d_2 < b$.




Since $d_2 \in [a, b]$ and $f(d_2) < 0$, we see that $d_2 \in S$. Therefore, $d_2$ is an element of $S$ that is
greater than $c$, and this contradicts the fact that $c$ is an upper bound for $S$. Thus, it is also
impossible for $f(c)$ to be negative.


As a result, we now know that $f(c) = 0$. Since $f(a) < 0$ and $f(b) > 0$, we see that $c \neq a$ and
$c \neq b$. Therefore, $c$ is indeed an element of $(a, b)$ such that $f(c) = 0$, as desired.


In mathematics, we often define certain objects in terms of the properties we would like
them to have. For example, we define $\sqrt{2}$ to be the positive real number whose square is equal
to 2. Similarly, we define $7^{\frac{1}{3}}$ to be the real number whose cube is equal to 7. However, just
because we make a definition in no way guarantees that there exists an object that has those 
properties. For example, we could define an object to be the positive rational number whose 
square is 2, but, as we know, there is no rational number whose square is 2. Similarly, we 
could define an object to be a real number whose square is —1, but there is no real number 
with that property, since the square of every real number is greater than or equal to zero. 




We can now use the Intermediate Value Theorem to easily show that real numbers like $\sqrt{2}$
and $7^{\frac{1}{3}}$ do indeed exist.


Corollary 4.8. Let $a, n \in \mathbb{N}$. Then there exists a positive real number $c$ such that $c^n = a$.
In other words, there does exist a real number $a^{\frac{1}{n}}$.


Proof. If $a = 1$, then $c = 1$ will work, and if $n = 1$, then $c = a$ will work. Therefore, we may
now assume that $a > 1$ and $n > 1$. Let $f(x)$ be the polynomial $x^n - a$. Observe that


$$f(0) = -a < 0 \quad\text{and}\quad f(a) = a^n - a = a(a^{n-1} - 1) > 0.$$


Since polynomials are continuous everywhere and $f(x)$ goes from negative to positive on the
interval $[0, a]$, the Intermediate Value Theorem tells us that there exists some $c \in (0, a)$ such
that $f(c) = 0$. As a result, $c^n - a = 0$, which implies that $c^n = a$.


At first glance, the Intermediate Value Theorem seems to tell us that roots of certain 
polynomials exist, but it doesn’t appear to help us find those roots. However, if we consider a 
real number to be a decimal expansion, then the Intermediate Value Theorem does indeed 
provide an algorithm for finding roots of polynomials. 


Example: Using the Intermediate Value Theorem to find $\sqrt{2}$


Let $f(x) = x^2 - 2$. Since $f(1) < 0$ and $f(2) > 0$, the Intermediate Value Theorem
guarantees that there is a root between 1 and 2, so


$$1 < \sqrt{2} < 2.$$


Next, partition the interval [1,2] into the ten equal subintervals 


[1, 1.1], [1.1, 1.2], [1.2, 1.3], [1.3, 1.4], [1.4, 1.5], 
[1.5, 1.6], [1.6, 1.7], [1.7, 1.8], [1.8, 1.9], [1.9, 2.0]. 


Since f(x) has no rational roots, none of the endpoints of the ten subintervals will be a 
root. However, since f(x) changes from negative to positive in the interval [1, 2], it must 
change from negative to positive in one of the ten subintervals. Therefore, if we plug the 
endpoints of our ten subintervals into f(x), we will see where f(x) changes from negative 
to positive. In particular, f(1.4) <0 and f(1.5) > 0, so 


$$1.4 < \sqrt{2} < 1.5.$$




We can now iterate the above procedure. Partition the interval [1.4, 1.5] into ten equal 
subintervals and plug the endpoints of each subinterval into f(x) to determine on which 
subinterval $f(x)$ changes sign. We will see that $f(1.41) < 0$ and $f(1.42) > 0$, so


$$1.41 < \sqrt{2} < 1.42.$$


We can apply this algorithm as many times as we desire, to determine the decimal 
expansion to as many decimal places as we wish. For example, if we apply this algorithm 
ten times, we will see that 


$$1.4142135623 < \sqrt{2} < 1.4142135624.$$
If we apply this algorithm twenty times, we will see that
$$1.41421356237309504880 < \sqrt{2} < 1.41421356237309504881.$$


As this example indicates, the Intermediate Value Theorem not only tells us that there 
exists a positive real number whose square is 2, but it also provides us with an algorithm
to compute its decimal expansion to any desired number of decimal places. 
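
The procedure in this example can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `ivt_decimal_digits` is hypothetical, exact `Fraction` arithmetic is used so that no rounding occurs, and the sketch assumes that $f$ is negative at the left endpoint, positive at the right endpoint, and nonzero at every decimal endpoint it tests (as is the case for $x^2 - 2$).

```python
from fractions import Fraction

def ivt_decimal_digits(f, left, places):
    """Narrow [left, left + 1] to an interval of width 10**(-places) containing a root.

    Assumes f(left) < 0 < f(left + 1). Each pass splits the current interval
    into ten equal subintervals and keeps the one where f changes sign.
    """
    left = Fraction(left)
    width = Fraction(1)
    for _ in range(places):
        width /= 10
        for k in range(10):
            # The first right endpoint where f is positive marks the sign change.
            if f(left + (k + 1) * width) > 0:
                left += k * width
                break
    return left, left + width

low, high = ivt_decimal_digits(lambda x: x * x - 2, 1, 10)
print(low, high)
```

Each pass of the outer loop fixes one more decimal place, so calling the function with `places=20` corresponds to applying the algorithm twenty times.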


In Chapter 6, we will prove the Fundamental Theorem of Algebra. There are many equivalent 
ways to state the theorem, and one of them is that any polynomial with real coefficients of 
degree at least one has a root in the complex numbers. Although there are certainly 
polynomials with real coefficients of even degree that do not have real roots, such as $x^2 + 1$
and $x^4 + 3x^2 + 7$, we can use the Intermediate Value Theorem to see that all polynomials of
odd degree with real coefficients must have a real root. 


Corollary 4.9. Let p(x) be a polynomial of odd degree with real coefficients. Then p(x) has a 
root in R. 


Intuition. If the leading coefficient of $p(x)$ is positive, then as $x \to \infty$, the leading term of $p(x)$
dominates and $p(x) \to \infty$. Similarly, as $x \to -\infty$, the leading term of $p(x)$ again dominates.
However, since the exponent of $x$ is odd and the leading coefficient is positive, the leading
term now approaches $-\infty$. As a result, as $x \to -\infty$, it follows that $p(x) \to -\infty$. Therefore,
there exists a negative number $a$ such that $p(a) < 0$ and a positive number $b$ such that
$p(b) > 0$. Since polynomials are continuous everywhere, the Intermediate Value Theorem tells
us that there is some $c$ between $a$ and $b$ such that $p(c) = 0$.


If the leading coefficient of $p(x)$ is negative, we can consider the polynomial $-p(x)$. The leading
coefficient of $-p(x)$ is positive and the preceding argument implies that there is a real number $c$
such that $-p(c) = 0$. However, it is now clear that $p(c) = 0$.




Proof. Let
$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$


where the $a_i \in \mathbb{R}$ and $n$ is odd. The preceding argument indicates that if $a_n < 0$, we could
instead consider the polynomial $-p(x)$. Therefore, without loss of generality, we may assume
that $a_n > 0$.


Observe that


$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = x^n \left( a_n + \frac{a_{n-1}}{x} + \cdots + \frac{a_1}{x^{n-1}} + \frac{a_0}{x^n} \right).$$


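If $x$ is a real number such that $|x|$ is sufficiently large, then each of the terms

$$\frac{a_{n-1}}{x}, \; \ldots, \; \frac{a_1}{x^{n-1}}, \; \frac{a_0}{x^n}$$
can be made as close to zero as we desire. Therefore, by choosing $x$ so that $|x|$ is sufficiently
large, we can make the sum


$$a_n + \frac{a_{n-1}}{x} + \cdots + \frac{a_1}{x^{n-1}} + \frac{a_0}{x^n}$$
as close to $a_n$ as we wish. Since $a_n > 0$, this tells us that when $|x|$ is large,
$$a_n + \frac{a_{n-1}}{x} + \cdots + \frac{a_1}{x^{n-1}} + \frac{a_0}{x^n} > 0. \tag{3}$$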


In particular, we can find a negative number a and a positive number b with sufficiently large 
absolute values that they both make the expression in (3) positive. 


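Since $a < 0$ and $n$ is odd, $a^n < 0$. Therefore,


$$p(a) = a_n a^n + a_{n-1} a^{n-1} + \cdots + a_1 a + a_0 = a^n \left( a_n + \frac{a_{n-1}}{a} + \cdots + \frac{a_1}{a^{n-1}} + \frac{a_0}{a^n} \right)$$
is the product of a negative number and a positive number. Thus, $p(a) < 0$.
Similarly, since $b > 0$, we have $b^n > 0$. Therefore,
$$p(b) = a_n b^n + a_{n-1} b^{n-1} + \cdots + a_1 b + a_0 = b^n \left( a_n + \frac{a_{n-1}}{b} + \cdots + \frac{a_1}{b^{n-1}} + \frac{a_0}{b^n} \right)$$


is the product of two positive numbers. Thus, $p(b) > 0$.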


Since p(a) < 0, p(b) > 0, and polynomials are continuous everywhere, we can apply the 
Intermediate Value Theorem to conclude that there exists a real number c between a and b 


such that p(c) = 0. 
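
The proof can be turned into a small computation. The sketch below is illustrative only and makes two choices that are not in the text: it looks for the endpoints $a = -b$ and $b$ by repeatedly doubling $b$, and it then narrows the interval by bisection (halving) rather than by the ten-way subdivision used earlier. All function names are hypothetical, and the coefficients are assumed to be integers or fractions so that the arithmetic stays exact.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate a polynomial given as [a_n, ..., a_1, a_0] by Horner's rule."""
    value = 0
    for a in coeffs:
        value = value * x + a
    return value

def sign_change_interval(coeffs):
    """For odd degree and a_n > 0, return (-b, b) with p(-b) < 0 < p(b)."""
    b = 1
    # The proof guarantees this holds once |x| is large enough.
    while not (evaluate(coeffs, -b) < 0 < evaluate(coeffs, b)):
        b *= 2
    return -b, b

def bisect_root(coeffs, steps=60):
    """Return (lo, hi) with p(lo) < 0 <= p(hi) and hi - lo halved `steps` times."""
    if coeffs[0] < 0:
        coeffs = [-a for a in coeffs]  # replace p by -p, as in the proof
    lo, hi = sign_change_interval(coeffs)
    lo, hi = Fraction(lo), Fraction(hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) < 0:
            lo = mid
        else:
            hi = mid
    return lo, hi

lo, hi = bisect_root([1, 0, -2, -5])  # p(x) = x^3 - 2x - 5
print(float(lo), float(hi))
```

The doubling loop terminates for exactly the reason given in the proof: for odd $n$ and $a_n > 0$, both sign conditions hold once $|x|$ is sufficiently large.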


Note that the Intermediate Value Theorem can be applied to all polynomials of odd degree to 
find real roots, but it can only be applied to some polynomials of even degree to find real roots. 




For example, we certainly used the Intermediate Value Theorem to find a real root of $x^2 - 2$.
However, we cannot apply the Intermediate Value Theorem to find a real root of $x^2 + 1$, since
the values of this polynomial never change from negative to positive. 


In the next two chapters, we will examine the complex numbers and provide a proof of the 
Fundamental Theorem of Algebra. The complex numbers are constructed from the real 
numbers, so an understanding of the real numbers is very important for an understanding of 
the complex numbers. Certainly, Corollary 4.9 is an important and useful tool in our study of 
the roots of polynomials. However, its proof relies on the Intermediate Value Theorem that, in 
turn, relies on the Least Upper Bound Property. Therefore, Corollary 4.9 is an example of an 
important theorem with applications to algebra whose proof is not of a purely algebraic 
nature. 


It is also the case that any proof of the Fundamental Theorem of Algebra will require some 
tools that are not of a purely algebraic nature. The proof we will provide in Chapter 6 is 
relatively easy and uses several standard facts from one- and two-variable calculus. Proofs of 
the Fundamental Theorem of Algebra that rely almost entirely on algebraic tools are more 
difficult, but they also need to use Corollary 4.9. 


Exercises for Sections 4.1 and 4.2 


In exercises 1-6, write each of the quotients as a decimal expansion that eventually repeats. 


1. 
2 
3 
4, 
5 
6 
In exercises 7-12, write each decimal as a quotient of positive integers.

7. $62.525252525252525252525\cdots$

8. $23.47444444444444444444\cdots$

9. $853.23939393939393939393939\cdots$

10. $5676.767676767676767676767\cdots$

11. $2.4678567856785687567856785\cdots$

12. $78538.53853853853853853853\cdots$


13. Let $a$, $b$ be relatively prime positive integers. Show that the decimal expansion of $\frac{a}{b}$
terminates if and only if the set of prime divisors of $b$ is a subset of $\{2, 5\}$.


For exercises 14—17, please first read the following: 


Ordinarily, when doing arithmetic, we work in base 10. In fact, the word decimal indicates 
that we are working in base 10, as the word has Latin roots meaning “dealing with tenths.” 
However, it remains the case that regardless of the base, those real numbers that can be 
written as a quotient of integers correspond to those whose expansion eventually repeats. 
(Notice that if we are not in base 10, we can no longer refer to the expansion as a decimal 
expansion.) In the next four exercises, you will work in different bases and convert repeating 
expansions into quotients of integers. 


14. In Base 2, write $0.111111111111\cdots$ as a quotient of positive integers.
15. In Base 3, write $0.111111111111\cdots$ as a quotient of positive integers.
16. In Base 5, write $2.323232323232323232323\cdots$ as a quotient of positive integers.


17. In Base 7, write $2.323232323232323232323\cdots$ as a quotient of positive integers.


In exercises 18—23, find the sum of the geometric series. 
18. 7293 (3)" 

19. T7233 (§)" 

20. Dro 3 (34)" 

21. neo 6(7r)" 

22. Ype-26( 7)" 


— n 
23. Yn 6 (77) 
In exercises 24—27, use the Intermediate Value Theorem to show that each polynomial has at 
least three real roots. 


24. $x^5 - 6x + 3$
25. $x^5 - 4x + 2$
26. $x^5 - 8x + 6$


27. $x^5 - 30x + 5$




For exercises 28-31, please first read the following: 


In calculus, it follows from Rolle’s Theorem that the number of roots of a polynomial is at 
most one more than the number of roots of its derivative. In exercises 28-31, use this 
observation along with your work in exercises 24—27 to show that each of the following 
polynomials has exactly three real roots. 


28. $x^5 - 6x + 3$
29. $x^5 - 4x + 2$
30. $x^5 - 8x + 6$


31. $x^5 - 30x + 5$


32. Use the Intermediate Value Theorem and the algorithm after Corollary 4.8 to
approximate $\sqrt{13}$ to five decimal places.


33. Use the Intermediate Value Theorem and the algorithm after Corollary 4.8 to
approximate $\sqrt{37}$ to five decimal places.


34. Use the Intermediate Value Theorem and the algorithm after Corollary 4.8 to 
approximate </'11 to five decimal places. 


35. Use the Intermediate Value Theorem and the algorithm after Corollary 4.8 to 
approximate «/43 to five decimal places. 


36. Use the Intermediate Value Theorem and the algorithm after Corollary 4.8 to 
approximate the real root of 2x? + 5x — 11 to five decimal places. 


37. Use the Intermediate Value Theorem and the algorithm after Corollary 4.8 to
approximate the two real roots of 3x+ — 4x — 9 to five decimal places. 


38. Use Corollary 4.9 to prove the following: if $a$ is a real number and $n$ is a positive odd
integer, then there exists a real number $c$ such that $c^n = a$.


39. Prove that if $a > 0$ is a real number and $n$ is a positive even integer, then there exist
exactly two real numbers $c$ such that $c^n = a$.


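40. For each set, find the least upper bound (if it exists). In each case, if there is a least upper
bound, state whether it is an element of the set.
(a) $\{x \in \mathbb{R} \mid x^2 < 5\}$
(b) $\{x \in \mathbb{R} \mid x^2 \leq 5\}$
(c) $\{x \in \mathbb{R} \mid x^2 > 5\}$
(d) $\{x \in \mathbb{Q} \mid x^2 < 5\}$
(e) $\{x \in \mathbb{Q} \mid x^2 \leq 5\}$
(f) $\{x \in \mathbb{Z} \mid x^2 < 5\}$
(g) $\{x \in \mathbb{N} \mid x^2 < 5\}$
(h) $\{x \in \mathbb{N} \mid x^2 > 5\}$


41. For each set, find the least upper bound (if it exists). In each case, if there is a least upper
bound, state whether it is an element of the set.
(a) $\{x \in \mathbb{R} \mid x^3 < 10\}$
(b) $\{x \in \mathbb{R} \mid x^3 \leq 10\}$
(c) $\{x \in \mathbb{R} \mid x^3 > 10\}$
(d) $\{x \in \mathbb{Q} \mid x^3 < 10\}$
(e) $\{x \in \mathbb{Q} \mid x^3 \leq 10\}$
(f) $\{x \in \mathbb{Z} \mid x^3 < 10\}$
(g) $\{x \in \mathbb{N} \mid x^3 < 10\}$
(h) $\{x \in \mathbb{N} \mid x^3 > 10\}$


42. For each set, find the least upper bound (if it exists). In each case, if there is a least upper
bound, state whether it is an element of the set.
(a) $\{x \in \mathbb{R} \mid x^3 - 3x^2 + 2x < 0\}$
(b) $\{x \in \mathbb{R} \mid x^3 - 3x^2 + 2x \leq 0\}$
(c) $\{x \in \mathbb{R} \mid x^3 - 3x^2 + 2x > 0\}$
(d) $\{x \in \mathbb{Z} \mid x^3 - 3x^2 + 2x < 0\}$
(e) $\{x \in \mathbb{Z} \mid x^3 - 3x^2 + 2x \leq 0\}$
(f) $\{x \in \mathbb{Z} \mid x^3 - 3x^2 + 2x = 0\}$


43. For each set, find the least upper bound (if it exists). In each case, if there is a least upper
bound, state whether it is an element of the set.
(a) $(\sqrt{3}, \sqrt{47})$
(b) $[\sqrt{3}, \sqrt{47}]$
(c) $(\sqrt{3}, \sqrt{47}]$
(d) $[\sqrt{3}, \sqrt{47})$
(e) $(-\infty, 47)$
(f) $(\sqrt{3}, \infty)$
(g) $\{x \in \mathbb{Q} \mid x \in (\sqrt{3}, \sqrt{47})\}$
(h) $\{x \in \mathbb{Q} \mid x \in [\sqrt{3}, \sqrt{47}]\}$
(i) $\{x \in \mathbb{N} \mid x \in (\sqrt{3}, \sqrt{47})\}$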


44. Use the Well Ordering Principle to show that if S is a nonempty subset of N with an 
upper bound, then the least upper bound of S must be an element of S. 


45. If $M = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$, show that $M$ does not satisfy the Least Upper Bound
Property. In other words, show that there exists a nonempty subset $S$ of $M$ such that $S$
has an upper bound but the least upper bound of $S$ does not belong to $M$.


For exercise 46, please read the following: 


The set $\mathbb{N}$ is certainly a subset of the set of positive rational numbers, $\mathbb{Q}^+$, as well as of the set
of polynomials with integer coefficients, $\mathbb{Z}[x]$. Therefore, in some sense, we can think of $\mathbb{Q}^+$
and $\mathbb{Z}[x]$ as being “larger” than $\mathbb{N}$. However, in the exercises that follow Section 3.5,
exercises 48 and 51 show that there exist injections $f : \mathbb{Q}^+ \to \mathbb{N}$ and $h : \mathbb{Z}[x] \to \mathbb{N}$. Observe
that the range of $f$ is a subset of $\mathbb{N}$ and $f$ is a bijection between $\mathbb{Q}^+$ and the range of $f$.
Therefore, in some sense, we can think of a copy of $\mathbb{Q}^+$ as living inside of $\mathbb{N}$. Therefore,
although $\mathbb{N}$ is a subset of $\mathbb{Q}^+$, there is some justification for thinking of $\mathbb{N}$ and $\mathbb{Q}^+$ as having
the same size. Since $\mathbb{N}$ is a subset of $\mathbb{Z}[x]$, yet there is an injection from $\mathbb{Z}[x]$ to $\mathbb{N}$, we can also
think of $\mathbb{N}$ and $\mathbb{Z}[x]$ as having the same size.


Let us now consider the open interval $(0, 1)$ that consists of all real numbers between 0 and 1.
We can ask whether there exists an injection $v : (0, 1) \to \mathbb{N}$. If the answer is no, then we could think
of the set $(0, 1)$ as being larger than $\mathbb{N}$. It would then be logical to assert that the set $\mathbb{R}$ is larger
than the sets $\mathbb{N}$, $\mathbb{Q}^+$, and $\mathbb{Z}[x]$. The goal of exercise 46 is to prove that no injection exists from
$(0, 1)$ to $\mathbb{N}$.


46. Suppose $v : (0, 1) \to \mathbb{N}$ is an injection. We may think of every element $\alpha \in (0, 1)$ as a
decimal expansion $.\alpha_1\alpha_2\ldots\alpha_n\ldots$. In order to describe an element of $(0, 1)$, for every
$n \in \mathbb{N}$, we need to say which element of the set $\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$ is in the $n$th
decimal place.
We will define the element $\alpha = .\alpha_1\alpha_2\ldots\alpha_n\cdots \in (0, 1)$ as follows:
(1) if $n \in \mathbb{N}$ and $n$ is not in the range of $v$, then we let $\alpha_n = 1$;
(2) if $n \in \mathbb{N}$ and $n = v(\beta)$, where $\beta \in (0, 1)$ and the $n$th decimal place of $\beta$ is not 1, then
$\alpha_n = 1$;
(3) if $n \in \mathbb{N}$ and $n = v(\beta)$, where $\beta \in (0, 1)$ and the $n$th decimal place of $\beta$ is 1, then
$\alpha_n = 2$.

Let $m = v(\alpha)$ and then examine the $m$th decimal place of $\alpha$. If you can obtain a
contradiction, you will have shown that no injection $v : (0, 1) \to \mathbb{N}$ can exist.


4.3 Equivalence Relations 


Earlier in this chapter, we pointed out that there is a second approach that can be used to 
examine the rational numbers and real numbers. In this approach, instead of viewing the 
rational numbers as a subset of the real numbers, we construct the rational numbers directly 
from the integers and then construct the real numbers directly from the rational numbers. As 
mentioned earlier, for a more thorough discussion of this approach, one should look at a book 
on real analysis. However, some of the ideas and techniques used in this approach reappear 
frequently throughout algebra, and it will be helpful to become acquainted with these ideas at 
this time. 


We are accustomed to viewing rational numbers as quotients like 


7 11 36 3 —26 
23° 22 56" 21s 


Therefore, it would be natural to try to define the rational numbers as all ordered pairs of 
integers where the second number is not zero. In other words, we would be looking at 


$$\{(a, b) \mid a \in \mathbb{Z},\ b \in \mathbb{Z}, \text{ and } b \neq 0\}.$$


However, the situation is a little more complicated. We consider the quotients $\frac{1}{2}$ and $\frac{2}{4}$ to
represent the same number. More precisely, we consider the quotients $\frac{a}{b}$ and $\frac{c}{d}$ to be the same
precisely when $ad = bc$. In light of this, in constructing the rational numbers from the integers,
we need to consider the ordered pairs $(a, b)$ and $(c, d)$ to be the same object precisely when
$ad = bc$. Therefore, using this approach, we can define the rational numbers as the set


$$\mathbb{Q} = \{(a, b) \mid a \in \mathbb{Z},\ b \in \mathbb{Z}, \text{ and } b \neq 0\},$$


where (a, b) and (c, d) are considered to be the same object precisely when ad = bc. 


Having defined the rational numbers in this way, we next need to describe how to add and 
multiply rational numbers. Recall, when we viewed rational numbers as quotients, we 
defined addition and multiplication as 


$$\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd} \quad\text{and}\quad \frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}. \tag{4}$$


As a result, now that we are viewing elements of Q as ordered pairs, we should define 
addition and multiplication as 


(a,b) +(c,d)=(ad+bc,bd) and (a,b)-(c,d) = (ac, bd). 


At this point, there is a very subtle point that is usually glossed over. Let’s look at two
addition problems. If we apply the rule in (4) for the addition of rational numbers,
we see that


$$\frac{2}{3} + \frac{1}{2} = \frac{7}{6} \quad\text{and}\quad \frac{6}{9} + \frac{2}{4} = \frac{42}{36}.$$


In this procedure, we consider $\frac{2}{3}$ and $\frac{6}{9}$ to be two names for the same object. Similarly, we
consider $\frac{1}{2}$ and $\frac{2}{4}$ to be two names for the same object. Therefore, when we add $\frac{2}{3}$ to $\frac{1}{2}$ we need
to get the same answer as when we add $\frac{6}{9}$ and $\frac{2}{4}$. In the first case, we obtain the answer $\frac{7}{6}$, and
in the second case, $\frac{42}{36}$. Fortunately, the rational numbers $\frac{7}{6}$ and $\frac{42}{36}$ are two names for the same
object, since $7 \cdot 36 = 6 \cdot 42$. The concern we now have is, since the same rational number can
have many different names, how do we know all of these names yield the same answer when
used in addition and multiplication problems?


At first glance, you might consider this concern to be somewhat silly. Your reaction might be
that obviously $\frac{1}{2}$ and $\frac{2}{4}$ are the same number, so we will get the same answer regardless of
which we use in an addition or multiplication problem. In our previous and more intuitive
approach, we viewed the rational numbers as being a subset of the real numbers. In that
approach, every rational number corresponds to a decimal that eventually repeats. Therefore,
there is no difference between $\frac{1}{2}$ and $\frac{2}{4}$ as they both correspond to the decimal .5. However, in
this more rigorous approach, we are constructing the rational numbers from the integers and 
then constructing the real numbers from the rational numbers. Therefore, we cannot use 
decimal expansions or properties of the real numbers to justify statements about the rational 
numbers. As a result, it is necessary to prove that addition and multiplication problems give us 
the same answer regardless of which ordered pair we use to represent a rational number. To 
make things clearer, it may be helpful to contrast our situation to the following: 


Example


Let $T = \{(a, b) \mid a \in \mathbb{Z},\ b \in \mathbb{Z}, \text{ and } b \neq 0\}$, where $(a, b)$ and $(c, d)$ are considered to be the
same object precisely when $ab = cd$. Therefore, in this example, $(10, 1)$, $(5, 2)$, and
$(-2, -5)$ are three different names for the same object. Let us now define addition and
multiplication in $T$ exactly the same way as we did in $\mathbb{Q}$. Thus,


$$(a, b) + (c, d) = (ad + bc, bd) \quad\text{and}\quad (a, b) \cdot (c, d) = (ac, bd).$$


Observe that
$$(10, 1) + (10, 1) = (10 \cdot 1 + 1 \cdot 10, 1 \cdot 1) = (20, 1)$$
and
$$(5, 2) + (5, 2) = (5 \cdot 2 + 2 \cdot 5, 2 \cdot 2) = (20, 4).$$


Also note that since $20 \cdot 1 \neq 20 \cdot 4$, it follows that $(20, 1) \neq (20, 4)$. Therefore, in this
example


$$(10, 1) = (5, 2) \quad\text{but}\quad (10, 1) + (10, 1) \neq (5, 2) + (5, 2).$$


Clearly this is a problem. If (10, 1) and (5, 2) represent the same object, then replacing 
one by the other should make no difference in an addition problem. However, we 
showed that replacing (10, 1) by (5, 2) did indeed change the answer. The problem is that 
we are no longer adding “numbers” but are adding “classes.” By this we mean that
$(10, 1)$ and $(5, 2)$ are merely two members of a class of ordered pairs that also includes
$(2, 5)$, $(1, 10)$, $(-10, -1)$, $(-5, -2)$, $(-2, -5)$, and $(-1, -10)$. When we write down a
rule for addition like $(a, b) + (c, d) = (ad + bc, bd)$, we are giving a formula for the
addition of two ordered pairs. But there is no guarantee that if we replace one of the 
ordered pairs by another member of its class, we will get the same answer. In order for 
addition and multiplication to be well defined, it must be the case that all members of 
the same class give us the same answer in addition and multiplication problems. 

As a result, addition in the set T is not well defined. In the exercises, you will be asked to 
show that multiplication in T is well defined. Furthermore, you will also be asked to show 
that if we instead define addition as 


(a,b) + (c, d) = (ab+cd, 1), 


then addition in T is now well defined. 
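
The failure of addition in $T$ can also be found by a brute-force search. The Python sketch below is only an illustration: the helper names and the search range are assumptions, and a finite search can exhibit a counterexample but cannot prove that an operation is well defined (that still requires the argument requested in the exercises).

```python
from itertools import product

def same_class(p, q):
    """In T, (a, b) and (c, d) name the same object exactly when ab = cd."""
    return p[0] * p[1] == q[0] * q[1]

def add(p, q):
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c, b * d)

def add_alt(p, q):
    """The alternative rule (a, b) + (c, d) = (ab + cd, 1)."""
    (a, b), (c, d) = p, q
    return (a * b + c * d, 1)

# A small sample of ordered pairs with nonzero second coordinate.
pairs = [(a, b) for a in range(-4, 5) for b in range(-4, 5) if b != 0]
# All ordered ways of giving two names to the same object of T.
equivalent = [(p, q) for p, q in product(pairs, repeat=2) if same_class(p, q)]

def find_counterexample(op):
    """Return names p1 = p2 and q1 = q2 whose results differ, or None if none is found."""
    for (p1, p2), (q1, q2) in product(equivalent, repeat=2):
        if not same_class(op(p1, q1), op(p2, q2)):
            return p1, p2, q1, q2
    return None

print(find_counterexample(add))      # some counterexample for the original rule
print(find_counterexample(mul))      # None within this sample
print(find_counterexample(add_alt))  # None within this sample
```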


The previous example indicates that when we define something in terms of a collection of 
classes, it is not automatic that addition and multiplication are well defined. Therefore, 
we will now prove that addition and multiplication in Q are well defined. 


Proposition 4.10. Addition and multiplication in Q are well defined. 
Proof. Let (a_1, b_1) and (a_2, b_2) be two members of one class and let (c_1, d_1) and (c_2, d_2) be
two members of a second class. Therefore, a_1 b_2 = b_1 a_2 and c_1 d_2 = d_1 c_2. We will first show
that addition is well defined. Thus, we must show that (a_1, b_1) + (c_1, d_1) belongs to the same
class as (a_2, b_2) + (c_2, d_2). The rule for addition in Q tells us that

(a_1, b_1) + (c_1, d_1) = (a_1 d_1 + b_1 c_1, b_1 d_1)

and

(a_2, b_2) + (c_2, d_2) = (a_2 d_2 + b_2 c_2, b_2 d_2).


It is easy to check that if n is a nonzero integer, then the ordered pairs (a, b) and (an, bn)
belong to the same class, so we can say that (a, b) = (an, bn). Therefore, if we multiply both
terms in (a_1 d_1 + b_1 c_1, b_1 d_1) by b_2 d_2 and multiply both terms in (a_2 d_2 + b_2 c_2, b_2 d_2) by b_1 d_1,
we obtain

(a_1, b_1) + (c_1, d_1) = (a_1 d_1 + b_1 c_1, b_1 d_1) = (a_1 b_2 d_1 d_2 + b_1 b_2 c_1 d_2, b_1 d_1 b_2 d_2)

and

(a_2, b_2) + (c_2, d_2) = (a_2 d_2 + b_2 c_2, b_2 d_2) = (a_2 b_1 d_1 d_2 + b_1 b_2 c_2 d_1, b_1 d_1 b_2 d_2).


We can replace a_2 b_1 by a_1 b_2 and c_2 d_1 by c_1 d_2 in the equations, and this results in

(a_2, b_2) + (c_2, d_2) = (a_2 b_1 d_1 d_2 + b_1 b_2 c_2 d_1, b_1 d_1 b_2 d_2)
= (a_1 b_2 d_1 d_2 + b_1 b_2 c_1 d_2, b_1 d_1 b_2 d_2) = (a_1, b_1) + (c_1, d_1).


Therefore, (a_1, b_1) + (c_1, d_1) and (a_2, b_2) + (c_2, d_2) do indeed belong to the same class, so
addition in Q is well defined.


The algebra is somewhat easier when we show that multiplication is well defined. In this case,
we must show that (a_1, b_1) · (c_1, d_1) belongs to the same class as (a_2, b_2) · (c_2, d_2). The rule for
multiplication in Q tells us that

(a_1, b_1) · (c_1, d_1) = (a_1 c_1, b_1 d_1) and (a_2, b_2) · (c_2, d_2) = (a_2 c_2, b_2 d_2).


To determine if (a_1 c_1, b_1 d_1) and (a_2 c_2, b_2 d_2) belong to the same class, we need to check if
a_1 c_1 b_2 d_2 = b_1 d_1 a_2 c_2. Using the facts that a_1 b_2 = b_1 a_2 and c_1 d_2 = d_1 c_2, we see that

a_1 c_1 b_2 d_2 = a_1 b_2 · c_1 d_2 = b_1 a_2 · d_1 c_2 = b_1 d_1 a_2 c_2.


Thus, (a_1, b_1) · (c_1, d_1) and (a_2, b_2) · (c_2, d_2) belong to the same class, and multiplication in Q
is well defined.
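If you would like to see the proposition at work on actual numbers, the following Python sketch may help. It assumes the class rule used in the proof, namely that (a, b) and (c, d) belong to the same class when ad = bc, and it produces a second member of each class by the rescaling (a, b) to (an, bn) mentioned in the proof. The helper names, bounds, and scaling factors are arbitrary illustrative choices, and a random spot check is of course no substitute for the proof.

```python
import random

def same_class(p, q):
    """(a, b) and (c, d) name the same rational number when ad = bc."""
    (a, b), (c, d) = p, q
    return a * d == b * c

def add(p, q):
    """(a, b) + (c, d) = (ad + bc, bd)."""
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)

def multiply(p, q):
    """(a, b) . (c, d) = (ac, bd)."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)

def random_pair(rng):
    """A pair (a, b) with b nonzero; the bounds are arbitrary."""
    return (rng.randint(-50, 50), rng.choice([-1, 1]) * rng.randint(1, 50))

def rescale(p, n):
    """Another member of the class of p, namely (an, bn) with n nonzero."""
    return (p[0] * n, p[1] * n)

def spot_check(trials=1000, seed=0):
    """Replace both inputs by rescaled class members and compare the results."""
    rng = random.Random(seed)
    factors = [-3, -2, -1, 2, 3, 7]
    for _ in range(trials):
        p, q = random_pair(rng), random_pair(rng)
        p2, q2 = rescale(p, rng.choice(factors)), rescale(q, rng.choice(factors))
        if not same_class(add(p, q), add(p2, q2)):
            return False
        if not same_class(multiply(p, q), multiply(p2, q2)):
            return False
    return True

print(spot_check())
```

For instance, (1, 2) + (1, 3) = (5, 6), while the rescaled pairs give (2, 4) + (-2, -6) = (-20, -24), and these two answers belong to the same class because 5 · (-24) = 6 · (-20).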


The concept of dealing with a class of objects as opposed to a single object is an important 
theme which runs throughout mathematics. The idea behind this is that there are objects in 
mathematics that may not be identical but, in certain circumstances, can be thought of as
equivalent. The ordered pairs (1, 2) and (2, 4) are not identical but, in the context of
constructing the rational numbers, can be thought of as equivalent. For an everyday example, 
cash and credit cards are not identical. If you are arriving at a tollbooth, you would rather have 
cash, whereas if you are trying to rent a car, you would rather have a credit card. However, in 
many contexts, such as for most purchases in stores or restaurants, having cash is equivalent to 
having a credit card. 


For another example, suppose we consider two days in November to be equivalent if they land
on the same day of the week. In this example, the 1st, 8th, 15th, 22nd, and 29th of the month
are all equivalent to each other. Similarly, none of these days are equivalent to the 2nd, but the
2nd, 9th, 16th, 23rd, and 30th of the month are all equivalent. When two objects are, in a
certain context, considered to be equivalent, we use the symbol ~. Therefore, in this example,
we have


1 ~ 8, 8 ~ 1, 25 ~ 25, 10 ~ 17, 21 ~ 14, 21 ~ 7.


If two objects are not considered to be equivalent, we use the symbol ≁. Therefore, in this
example, we have


1 ≁ 4, 10 ≁ 20, 25 ≁ 8, 2 ≁ 29.

We can now formalize this concept.


Definition 4.11. Given a set S, we say that ~ is an equivalence relation if it satisfies
(a) Reflexive Property: For every x ∈ S, x ~ x.
(b) Symmetric Property: For every x, y ∈ S, if x ~ y, then y ~ x.
(c) Transitive Property: For every x, y, z ∈ S, if x ~ y and y ~ z, then x ~ z.


Equivalence relations can be defined even more formally as certain types of subsets of S x S. 
However, for our purposes, a more formal approach is not needed. 


Examples


In each of these examples, you should convince yourself that ~ satisfies the reflexive,
symmetric, and transitive properties.
1. Given the set Z, define ~ as x ~ y precisely when 2 divides x - y.

2. Given the set Z, if n is a fixed positive integer, define ~ as x ~ y precisely when n
divides x - y.

3. Given the set Q, define ~ as x ~ y precisely when x - y is an integer.

4. Given the set R, define ~ as x ~ y precisely when x^2 = y^2.

5. Given the set R[x], which is all polynomials with real coefficients, define
f(x) ~ g(x) precisely when f(0) = g(0).


We now consider several examples where ~ is not an equivalence relation. 


Examples


In each of these examples, you should convince yourself that ~ satisfies exactly two of the
three properties needed to be an equivalence relation.
1. Given the set Z, define ~ as x ~ y precisely when x ≤ y. This example is reflexive
and transitive, but it is not symmetric. For example, 1 ~ 2 but 2 ≁ 1.

2. Given the set R, define ~ as x ~ y precisely when |x - y| ≤ 1. This example is
reflexive and symmetric, but it is not transitive. For example, 2 ~ 1 and 1 ~ 0 but
2 ≁ 0.

3. Given the set Q, define ~ as x ~ y precisely when xy > 0. This example is symmetric
and transitive, but it is not reflexive. In particular, 0 ≁ 0.
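Checking the three properties by hand is the real exercise, but a small experiment can help you spot which property fails. The Python sketch below tests each property on a finite sample of integers. The relations used are "2 divides x - y", "x ≤ y", "|x - y| ≤ 1", and "xy > 0", written out explicitly in the code; the function name and the sample are illustrative assumptions. A result of False on the sample does show that a property fails, whereas a result of True only says that no failure was found among the sampled elements.

```python
from itertools import product

def check_properties(related, sample):
    """Test the three properties on a finite sample.

    Returns (reflexive, symmetric, transitive) as booleans.
    """
    sample = list(sample)
    reflexive = all(related(x, x) for x in sample)
    symmetric = all(related(y, x)
                    for x, y in product(sample, repeat=2)
                    if related(x, y))
    transitive = all(related(x, z)
                     for x, y, z in product(sample, repeat=3)
                     if related(x, y) and related(y, z))
    return reflexive, symmetric, transitive

SAMPLE = range(-4, 5)

RELATIONS = {
    "2 divides x - y": lambda x, y: (x - y) % 2 == 0,
    "x <= y": lambda x, y: x <= y,
    "|x - y| <= 1": lambda x, y: abs(x - y) <= 1,
    "xy > 0": lambda x, y: x * y > 0,
}

for name, related in RELATIONS.items():
    print(name, check_properties(related, SAMPLE))
```

The sample deliberately contains 0, 1, and 2, since those are the elements used in the counterexamples above.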


Remember, our reason for looking at equivalence relations is that, as was the case when we 
constructed the rational numbers, we will often need to look at classes of objects instead of 
individual objects. In light of this, it is now natural to introduce the concept of equivalence 
classes. 


Definition 4.12. Let S be a set with equivalence relation ~. If x ∈ S, then we define the
equivalence class of x, denoted as [x], as


[x] = {y ∈ S | y ~ x}.

In other words, [x] consists of all those elements in S that are equivalent to x.


Let us now return to some of our previous examples. 


Examples 


1. Let T = {(a, b) | a ∈ Z, b ∈ Z, and b ≠ 0} and define ~ as (a, b) ~ (c, d) precisely
when ad = bc. This is the equivalence relation used to construct the rational
numbers, and before continuing, you should convince yourself that ~ is indeed an
equivalence relation. In this case, [(a, b)] consists of all ordered pairs that yield
fractions that are equivalent to a/b. For example, [(1, 2)] consists of all ordered pairs
that yield fractions that are equal to 1/2, and some of the elements in [(1, 2)] are (2, 4),
(-3, -6), (501, 1002), and (-231, -462).


2. Let ~ be the equivalence relation defined as x ~ y precisely when 2 divides x - y.
In this case, there are only two equivalence classes:


{0, ±2, ±4, ±6, ±8, ...} and {±1, ±3, ±5, ±7, ±9, ...}.


We can certainly think of one class as being the even integers and the other as the
odd integers. It is important to note that each class has an infinite number of names.
For example, [0], [2], [1988], and [-9876] are four of the infinite number of different
names for one of the classes. Similarly, [1], [1951], and [-1955] are three of the
infinite number of different names for the other class.


When dealing with the equivalence classes of an equivalence relation, it will always be the
case, as it is in the preceding example, that any two classes you choose either have no elements
in common or are the same class, perhaps with different names. In the previous example, the
equivalence classes partition Z into two disjoint pieces: the evens and the odds. It will always
be the case that the equivalence classes of an equivalence relation partition a set into disjoint
pieces. We record these observations as


Proposition 4.13. Let S be a set with equivalence relation ~. If [x] and [y] are two
equivalence classes, then either [x] ∩ [y] = ∅ or [x] = [y]. Since every element of S
belongs to an equivalence class, this says that the equivalence classes partition S into
disjoint pieces.


Proof. Suppose [x] and [y] are two equivalence classes such that [x] ∩ [y] ≠ ∅; we now need
to show that [x] = [y]. To this end, let z ∈ [x] ∩ [y]; therefore, z ~ x and z ~ y. Applying both
the symmetric and transitive properties, we first see that x ~ z and y ~ z and then see that
x ~ y and y ~ x. Now suppose a ∈ [x]; thus, a ~ x and x ~ y and transitivity implies that
a ~ y. As a result, a ∈ [y], and we see that [x] ⊆ [y].


In the other direction, suppose b ∈ [y]. Thus, b ~ y and y ~ x, so transitivity tells us that b ~ x.
As a result, b ∈ [x], and we see that [y] ⊆ [x]. Hence, [x] = [y].


In order to conclude the proof, we only need to show that every element of S is in some 
equivalence class. So far, this proof has only used the symmetric and transitive properties. To 
finish the proof, we will use the reflexive property. If x € S, we need to show that x belongs to 
some equivalence class. However, the reflexive property tells us that x ~ x, so x € [x], as 
desired. 
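The partition described in Proposition 4.13 can be computed directly for a finite piece of a set. The following Python sketch groups the integers from -6 to 6 into classes under the relation "n divides x - y"; the function names and the range are illustrative assumptions. Note that the code compares each new element with only one representative of each existing class, which is legitimate precisely because of the symmetric and transitive properties used in the proof.

```python
def equivalence_classes(elements, related):
    """Group elements into classes, assuming `related` is an equivalence relation."""
    classes = []
    for x in elements:
        for cls in classes:
            # One representative suffices when ~ is symmetric and transitive.
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            # x is related to no existing class, so it starts a new one.
            classes.append([x])
    return classes

def congruent_mod(n):
    """The relation x ~ y precisely when n divides x - y."""
    return lambda x, y: (x - y) % n == 0

ELEMENTS = list(range(-6, 7))

parity_classes = equivalence_classes(ELEMENTS, congruent_mod(2))
mod5_classes = equivalence_classes(ELEMENTS, congruent_mod(5))

for cls in parity_classes:
    print(cls)
```

With n = 2 the two lists that appear are the even and the odd integers in the range, and every integer in the range lands in exactly one of them.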




So far you have seen two approaches to the rational numbers. We first viewed them as a subset 
of the real numbers but later viewed them as equivalence classes of ordered pairs of integers. 
You may wonder what is the purpose of the second approach, since it appears to be much more 
abstract and complicated than the first. The reason we introduced the second approach is that 
there are some problems beneath the surface with the first approach. Recall that we needed to 
first assume that we had an intuitive understanding of real numbers as decimal expansions. For 
example, we can think of 


√2 = 1.4142135623730950488016887242097...
and


√3 = 1.73205080756887729352744634150587....


This raises some questions as to how we do basic arithmetic. In particular, how do we express
numbers like √2 + √3, √2 · √3, or √2/√3 as decimals? In ordinary arithmetic, we perform
operations like addition and multiplication from right to left. By that we mean that if we were
computing 315 + 721, we would (a) first add the 5 and 1 in the ones column, then (b) move
to the left and add the 1 and 2 in the tens column, and then (c) move to the left and add the 3
and 7 in the hundreds column with part of the answer carrying over to the thousands column.
The problem is that, whereas we usually start basic arithmetic operations at the far right, a
decimal expansion can continue on forever to the right. As a result, it is not clear how to add
and multiply numbers whose decimal expansions go on forever.


In an attempt to resolve this problem, we can think of a decimal expansion as a sequence of
rational numbers. In other words, we can think of √2 and √3 as the following sequences:


√2 = 1, 1.4, 1.41, 1.414, 1.4142, 1.41421, 1.414213, 1.4142135, 1.41421356, ...
√3 = 1, 1.7, 1.73, 1.732, 1.7320, 1.73205, 1.732050, 1.7320508, 1.73205080, ....


We can now easily generate a sequence that represents the sum or product of √2 and √3 by
adding or multiplying on a term-by-term basis. For example, the fifth term of the sequence for
√2 + √3 would be 1.4142 + 1.7320, since 1.4142 and 1.7320 are the fifth terms in the
respective sequences representing √2 and √3. As a result, we now have


√2 + √3 = 2, 3.1, 3.14, 3.146, 3.1462, 3.14626, 3.146263, 3.1462643, ...
and


√2 · √3 = 1, 2.38, 2.4393, 2.449048, 2.4493944, 2.4494824305, ....
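The term-by-term arithmetic is easy to reproduce. The Python sketch below uses the standard decimal module so that each truncated expansion is treated as an exact rational number rather than a floating-point approximation; the variable and function names are illustrative assumptions. Only the first six terms of each sequence are entered.

```python
from decimal import Decimal

# Truncated decimal expansions, entered as strings so they are stored exactly.
SQRT2_TERMS = [Decimal(s) for s in ("1", "1.4", "1.41", "1.414", "1.4142", "1.41421")]
SQRT3_TERMS = [Decimal(s) for s in ("1", "1.7", "1.73", "1.732", "1.7320", "1.73205")]

def termwise(op, xs, ys):
    """Combine two sequences one term at a time."""
    return [op(x, y) for x, y in zip(xs, ys)]

sums = termwise(lambda x, y: x + y, SQRT2_TERMS, SQRT3_TERMS)
products = termwise(lambda x, y: x * y, SQRT2_TERMS, SQRT3_TERMS)

for n, (s, p) in enumerate(zip(sums, products), start=1):
    print(n, s, p)
```

Every term produced this way is a rational number, so no knowledge of the real numbers √2 + √3 or √2 · √3 is needed to write the sequences down.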


Hopefully, it is now becoming clearer why we introduced a second approach to viewing the 
rational numbers. On an intuitive level, it might have been easier to simply view the rational
numbers as a subset of the real numbers. However, this intuitive approach ignored some
serious problems regarding how to do basic arithmetic with decimal expansions. As we just 
illustrated, these problems can be dealt with by viewing real numbers as sequences of rational 
numbers. Therefore, from a logical viewpoint, it makes sense to construct the rational numbers 
from the integers and then to construct the real numbers from the rational numbers. Thus, our 
second approach to the rational numbers, despite being more abstract, is needed. Before 
concluding this discussion, there are two more points to deal with. 


We would like to consider real numbers as sequences of rational numbers. In this way, it 
becomes possible to perform ordinary arithmetic operations with real numbers. However, how 
do we handle sequences like 


0,1,0,1,0,1,0,1,0,1,0,1,0,1,...? 


Certainly this sequence is a legitimate sequence, and it can be added to or multiplied with 
other sequences. But this sequence cannot possibly be used to represent a real number as it 
oscillates forever between two different numbers. Similarly, the sequence 


1,2,3,4,5,6,7,8,9, 10, 11,12, 13,... 


cannot represent a real number as it increases without bound. Your impulse might be to say 
that we should only consider those sequences of rational numbers which converge. If so, you 
are on the right track, but, once again, another technical difficulty arises. Think about a
sequence we used to represent √2 like


1, 1.4, 1.41, 1.414, 1.4142, 1.41421, 1.414213, 1.4142135, 1.41421356, ....


We know that this sequence converges, as it converges to √2. However, we now run the danger
of using some circular reasoning. We are at the point in our reasoning where we know that the
rational numbers exist, and we are trying to use them to give a formal definition of the real
numbers. We want to say that the preceding sequence represents √2, since the terms converge
to √2. However, we cannot say that the preceding sequence converges until we know that
the real number √2 exists, yet we cannot say that the real number √2 exists until we know
that the sequence converges. You should certainly spend some time thinking about this
problem before continuing.


The bottom line is that we need a way to describe those sequences that we think represent real 
numbers without using the concept of convergence. Recall from calculus that if a sequence 
converges, then the limit need not be one of the terms of the sequence. In particular, √2 does
not appear as one of the terms in the sequence we used to represent it. To resolve our problem, 
we need a concept that only refers to the terms in the sequence and doesn’t refer to an object, 
like the limit, which may not be one of the terms of the sequence. The concept that resolves 
this problem is Cauchy sequences. 




Definition 4.14. Let

{a_n}_{n=1}^∞ = a_1, a_2, a_3, a_4, a_5, ...

be a sequence. We say {a_n}_{n=1}^∞ is a Cauchy sequence if for every ε > 0, there exists a natural
number N, such that whenever n, m > N, we have |a_n - a_m| < ε.


Intuitively, being a Cauchy sequence means that eventually all the terms of the sequence 
become close to each other and stay as close to each other as you desire. In the exercises, you 
will be asked to show that every sequence that converges, using the definition of convergence 
from calculus, is a Cauchy sequence. Therefore, it now appears that we can consider the
real numbers to be the set of all Cauchy sequences of rational numbers. 
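To get a feel for the Cauchy condition, it can be tested on the first several terms of a sequence. The Python sketch below builds the decimal truncations 1, 1.4, 1.41, ... using only integer and rational arithmetic: the k-th truncation is taken to be the largest decimal with k digits after the point whose square is at most 2, so the real number √2 is never used. The function names are illustrative assumptions, and since only finitely many terms are examined, the output is evidence rather than a proof that the sequence is Cauchy.

```python
from fractions import Fraction
from math import isqrt

def sqrt2_truncation(k):
    """Largest decimal with k digits after the point whose square is at most 2."""
    return Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k)

def close_after(terms, start, eps):
    """True if all listed terms from index `start` on are within eps of each other."""
    tail = terms[start:]
    return all(abs(x - y) < eps for x in tail for y in tail)

# The rational terms 1, 1.4, 1.41, 1.414, ... with up to 12 decimal places.
TERMS = [sqrt2_truncation(k) for k in range(13)]

for k in range(6):
    eps = Fraction(1, 10 ** k)
    print(f"eps = {eps}: terms from index {k} on within eps?", close_after(TERMS, k, eps))

# By contrast, the oscillating sequence 0, 1, 0, 1, ... never settles down.
OSCILLATING = [Fraction(n % 2) for n in range(12)]
print(close_after(OSCILLATING, 6, Fraction(1, 2)))
```

The design choice worth noticing is that the test refers only to the terms of the sequence, exactly as Definition 4.14 does, and never to a limit.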


However, we have one more technical point to deal with. Consider the following sequences: 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...


and


1/2, 1/3, 1/4, 1/5, 1/6, 1/7, 1/8, 1/9, 1/10, 1/11, 1/12, 1/13, ...


Both of these sequences should represent the number 0. Similarly, the sequences


3/2, 4/3, 5/4, 6/5, 7/6, 8/7, 9/8, 10/9, 11/10, 12/11, 13/12, 14/13, ...


and


1/2, 2/3, 3/4, 4/5, 5/6, 6/7, 7/8, 8/9, 9/10, 10/11, 11/12, 12/13, 13/14, ...


should both represent the number 1. In light of this, several different sequences can represent 
the same real number. This may sound like a familiar situation, as earlier we saw that different 
ordered pairs can represent the same rational number. The solution, as before, is to define an 
appropriate equivalence relation and then to look at the equivalence classes. Intuitively, if two 
sequences have the same limit, then we should consider them to be equivalent. However, as 
mentioned before, our reasoning would be circular if, in our attempt to construct the real 
numbers, we defined the equivalence relation in terms of limits. Instead, we need to define an 
equivalence relation which only refers to the terms of the sequences. 


Definition 4.15. Let T be the set of Cauchy sequences of rational numbers. Define the
equivalence relation ~_cs as {a_n}_{n=1}^∞ ~_cs {b_n}_{n=1}^∞ precisely when for every ε > 0, there exists
some natural number N, such that whenever n > N, we have |a_n - b_n| < ε.




You should convince yourself that not only is ~_cs an equivalence relation but that if two
Cauchy sequences of rational numbers have the same limit, using the ordinary definition of
limit from calculus, then they will be equivalent. Therefore, we can now consider the real
numbers R to be the set of equivalence classes of Cauchy sequences of rational numbers using
the equivalence relation ~_cs. Throughout mathematics, we will continue to define certain sets
as the equivalence classes under a particular equivalence relation. As we did for the rational
numbers, we should always check that if an algebraic object is defined in terms of equivalence
classes, then operations like addition and multiplication are well defined. In the exercises,
you will be asked to show that addition and multiplication of real numbers are indeed well
defined using the definition above.
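As an illustration of Definition 4.15, consider the two sequences with n-th terms (n + 2)/(n + 1) and n/(n + 1), both of which should represent the number 1. Their difference is 2/(n + 1), which is less than 2/n, so for a given ε any natural number N with N ≥ 2/ε works. The Python sketch below checks this choice of N on a stretch of terms beyond N; the function names and the number of terms examined are illustrative assumptions, and the check is numerical evidence rather than a proof.

```python
from fractions import Fraction
from math import ceil

def a(n):
    """n-th term of the sequence 3/2, 4/3, 5/4, ..."""
    return Fraction(n + 2, n + 1)

def b(n):
    """n-th term of the sequence 1/2, 2/3, 3/4, ..."""
    return Fraction(n, n + 1)

def cutoff(eps):
    """A natural number N that works for eps, since |a(n) - b(n)| = 2/(n+1) < 2/n."""
    return ceil(Fraction(2) / eps)

def holds_beyond_cutoff(eps, how_many=1000):
    """Check |a(n) - b(n)| < eps for the first `how_many` values of n past the cutoff."""
    N = cutoff(eps)
    return all(abs(a(n) - b(n)) < eps for n in range(N + 1, N + 1 + how_many))

for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(eps, cutoff(eps), holds_beyond_cutoff(eps))
```

Notice that the comparison uses only the terms of the two sequences and never mentions their common limit, which is the point of the definition.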


Exercises for Section 4.3 


In exercises 1-8, ~ will be defined on Q. You will need to determine which of the reflexive, 
symmetric, and transitive properties are satisfied by ~ on Q. Whenever a property fails, 
provide an example that illustrates its failure. 


1. ~ is defined as x ~ y whenever x - y is an integer
2. ~ is defined as x ~ y whenever x + y is an integer
3. ~ is defined as x ~ y whenever xy is an integer
4. ~ is defined as x ~ y whenever x? > y*

5. ~ is defined as x ~ y whenever > ae 
6. ~ is defined as x ~ y whenever xy > 0 

7. ~ is defined as x ~ y whenever xy > 0 

8. ~ is defined as x ~ y whenever xy < 0 


In exercises 9-16, ~ will be defined on Z[x], the set of polynomials with integer coefficients. 
You will need to determine which of the reflexive, symmetric, and transitive properties are 
satisfied by ~ on Z[x]. Whenever a property fails, provide an example that illustrates its 
failure. 


9. ~ is defined as f(x) ~ g(x) whenever f(3) - g(3) = 0
10. ~ is defined as f(x) ~ g(x) whenever f(3) - g(3) ∈ {-1, 1}
11. ~ is defined as f(x) ~ g(x) whenever f(3) - g(3) ∈ {-1, 0, 1}
12. ~ is defined as f(x) ~ g(x) whenever f(3) - g(3) < 0
13. ~ is defined as f(x) ~ g(x) whenever f(3) - g(3) < 0
14. ~ is defined as f(x) ~ g(x) whenever f(5) · g(5) = 0
15. ~ is defined as f(x) ~ g(x) whenever f(5) · g(5) > 0
16. ~ is defined as f(x) ~ g(x) whenever f(5) · g(5) < 0


In exercises 17-24, ~ will be defined on R[x], the set of polynomials with real coefficients.
You will need to determine which of the reflexive, symmetric, and transitive properties are
satisfied by ~ on R[x]. Whenever a property fails, provide an example that illustrates its
failure.


17. ~ is defined as f(x) ~ g(x) whenever f(c) < g(c), for all c ∈ R.
18. ~ is defined as f(x) ~ g(x) whenever f(c) < g(c), for some c ∈ R.
19. ~ is defined as f(x) ~ g(x) whenever f(c) · g(c) = 0, for all c ∈ R.
20. ~ is defined as f(x) ~ g(x) whenever f(c) · g(c) > 0, for all c ∈ R.
21. ~ is defined as f(x) ~ g(x) whenever f(c) - g(c) ∈ Q, for all c ∈ R.
22. ~ is defined as f(x) ~ g(x) whenever f(c) - g(c) ∈ Q, for some c ∈ R.
23. ~ is defined as f(x) ~ g(x) whenever f(c) - g(c) ∈ Z, for all c ∈ R.
24. ~ is defined as f(x) ~ g(x) whenever f(c) - g(c) ∈ N, for all c ∈ R.
25. For the set Z, define ~ as a ~ b if and only if a - b is divisible by 5.
(a) Show that ~ is an equivalence relation on the set Z.
(b) How many equivalence classes are there? Briefly explain.
(c) How many elements are in each equivalence class? Briefly explain.
(d) List five elements of the equivalence class [23].
(e) List five elements of the equivalence class [-19].
26. For the set Z, define ~ as a ~ b if and only if a - b is divisible by 7.
(a) Show that ~ is an equivalence relation on the set Z.
(b) How many equivalence classes are there? Briefly explain.
(c) How many elements are in each equivalence class? Briefly explain.
(d) List five elements of the equivalence class [11].
(e) List five elements of the equivalence class [-11].


27. Let T be the set of positive rational numbers and then define ~ as a ~ b if and only if
a = b · 2^n, for some n ∈ Z. (For example, 5 ~ 10 as 5 = 10 · 2^(-1), whereas 5 ≁ 11, since
5 ≠ 11 · 2^n, for all n ∈ Z.)
(a) Show that ~ is an equivalence relation on the set T.
(b) How many equivalence classes are there? Briefly explain.
(c) How many elements are in each equivalence class? Briefly explain.
(d) List five elements in the equivalence class [4].
(e) List five elements in the equivalence class [7].


28. Let T be the set of positive rational numbers and then define ~ as a ~ b if and only if
a = b · 3^n, for some n ∈ Z. (For example, 21 ~ i as 21 = i 37, whereas 21 ≁ 42, since
21 ≠ 42 · 3^n, for all n ∈ Z.)
(a) Show that ~ is an equivalence relation on the set T.
(b) How many equivalence classes are there? Briefly explain.
(c) How many elements are in each equivalence class? Briefly explain.
(d) List five elements in the equivalence class [33].
(e) List five elements in the equivalence class [2].


29. Let R[x] be the set of polynomials with real coefficients and then define ~ as
f(x) ~ g(x) if and only if f'(x) = g'(x).
(a) Show that ~ is an equivalence relation on the set R[x].
(b) How many equivalence classes are there? Briefly explain.
(c) How many elements are in each equivalence class? Briefly explain.
(d) List five elements in the equivalence class [x].
(e) List five elements in the equivalence class [6x* - 2x + 8].


30. Let R[x] be the set of polynomials with real coefficients and then define ~ as
f(x) ~ g(x) if and only if f''(x) = g''(x).
(a) Show that ~ is an equivalence relation on the set R[x].
(b) How many equivalence classes are there? Briefly explain.
(c) How many elements are in each equivalence class? Briefly explain.
(d) List five elements in the equivalence class [x].
(e) List five elements in the equivalence class [3x* - 6].


31. Let Q[x] be the set of polynomials with rational coefficients and then define ~ as
f(x) ~ g(x) if and only if f(1) = g(1) and f(-1) = g(-1).
(a) Show that ~ is an equivalence relation on the set Q[x].
(b) How many equivalence classes are there? Briefly explain.
(c) How many elements are in each equivalence class? Briefly explain.
(d) List five elements in the equivalence class [1].
(e) List five elements in the equivalence class [x].


32. Let Q[x] be the set of polynomials with rational coefficients and then define ~ as
f(x) ~ g(x) if and only if f(√2) = g(√2).
(a) Show that ~ is an equivalence relation on the set Q[x].
(b) How many equivalence classes are there? Briefly explain.
(c) How many elements are in each equivalence class? Briefly explain.
(d) List five elements in the equivalence class [2].
(e) List five elements in the equivalence class [x].


33. In this exercise, we revisit the set T that appeared in the example preceding
Proposition 4.10 and show that the multiplication of equivalence classes is well defined.
When we were first introduced to the set T, we had not yet defined equivalence classes.
Therefore, we will now restate the problem more formally in terms of equivalence
relations and equivalence classes. Let T = {(a, b) | a ∈ Z, b ∈ Z, b ≠ 0} and define ~ on
T as (a, b) ~ (c, d) whenever ab = cd. Next, we define the multiplication of equivalence
classes as


[(a, b)] · [(c, d)] = [(ac, bd)],


for all equivalence classes [(a, b)], [(c, d)]. Prove that the multiplication of equivalence
classes is well defined.


34. Using the set and equivalence relation in exercise 25, define the addition and
multiplication of equivalence classes as


[a] + [b] = [a + b] and [a] · [b] = [ab],


for all [a], [b].
(a) Show that the addition of equivalence classes is well defined.
(b) Show that the multiplication of equivalence classes is well defined.


35. Using the set and equivalence relation in exercise 26, define the addition and
multiplication of equivalence classes as


[a] + [b] = [a + b] and [a] · [b] = [ab],

for all [a], [b].
(a) Show that the addition of equivalence classes is well defined.
(b) Show that the multiplication of equivalence classes is well defined.


36. For the set R, define ~ as x ~ y whenever |x| = |y|. You may assume that ~ is an
equivalence relation. Next, define the addition and multiplication of equivalence
classes as


[x] + [y] = [x + y] and [x] · [y] = [xy],

for all [x], [y].
(a) Show that the multiplication of equivalence classes is well defined.
(b) Give an example that illustrates that the addition of equivalence classes is not well
defined.


37. For the set Q[x], define ~ as f(x) ~ g(x) whenever f(x) - g(x) is a constant. You may
assume that ~ is an equivalence relation. Next, define the addition and multiplication of
equivalence classes as


[f(x)] + [g(x)] = [f(x) + g(x)] and [f(x)] · [g(x)] = [f(x)g(x)],

for all [f(x)], [g(x)].
(a) Show that the addition of equivalence classes is well defined.
(b) Give an example that illustrates that the multiplication of equivalence classes is not
well defined.


38. For the set Z[x], define ~ as f(x) ~ g(x) whenever f(5) - g(5) is a multiple of 3.
You may assume that ~ is an equivalence relation. Next, define the addition and
multiplication of equivalence classes as


[f(x)] + [g(x)] = [f(x) + g(x)] and [f(x)] · [g(x)] = [f(x)g(x)],

for all [f(x)], [g(x)].
(a) Show that the addition of equivalence classes is well defined.
(b) Show that the multiplication of equivalence classes is also well defined.


39. For the set Z[x], define ~ as f(x) ~ g(x) whenever f(8) - g(8) is a multiple of 7. You
may assume that ~ is an equivalence relation. Next, define the addition and
multiplication of equivalence classes as


[f(x)] + [g(x)] = [f(x) + g(x)] and [f(x)] · [g(x)] = [f(x)g(x)],

for all [f(x)], [g(x)].
(a) Show that the addition of equivalence classes is well defined.
(b) Show that the multiplication of equivalence classes is also well defined.


40. For the set Z[x], define ~ as f(x) ~ g(x) whenever f(x) and g(x) have the same
leading coefficient. You may assume that ~ is an equivalence relation. Next, define the
addition and multiplication of equivalence classes as


[f(x)] + [g(x)] = [f(x) + g(x)] and [f(x)] · [g(x)] = [f(x)g(x)],

for all [f(x)], [g(x)].
(a) Give an example that illustrates that the addition of equivalence classes is not well
defined.
(b) Show that the multiplication of equivalence classes is well defined.


41. For the set Z, define ~ as a ~ b whenever a - b is divisible by 10. You may assume that
~ is an equivalence relation and may also assume that the addition and multiplication of
equivalence classes are well defined, where we define addition and multiplication as


[a] + [b] = [a + b] and [a] · [b] = [ab],

for all [a], [b].
(a) Find a positive integer s such that [s] + [7] = [0].
(b) Find a positive integer t such that [t] + [4] = [2].
(c) Find a positive integer u such that [u] · [7] = [1].
(d) Find a positive integer v such that [v] · [7] = [9].
(e) Find a positive integer w such that [w] · [8] = [0].


42. For the set Z, define ~ as a ~ b whenever a - b is divisible by 12. You may assume that
~ is an equivalence relation and may also assume that the addition and multiplication of
equivalence classes are well defined, where we define addition and multiplication as


[a] + [b] = [a + b] and [a] · [b] = [ab],

for all [a], [b].
(a) Find a positive integer s such that [s] + [5] = [0].
(b) Find a positive integer t such that [t] + [8] = [3].
(c) Find a positive integer u such that [u] · [7] = [1].
(d) Find a positive integer v such that [v] · [7] = [2].
(e) Find a positive integer w such that [w] · [9] = [0].


43. For the set R, define ~ as a ~ b whenever a - b is an integer. You may assume that ~ is
an equivalence relation. Next, define the addition and multiplication of equivalence
classes as


[a] + [b] = [a + b] and [a] · [b] = [ab],

for all [a], [b].
(a) Show that the addition of equivalence classes is well defined.
(b) Give an example that illustrates that the multiplication of equivalence classes is not
well defined.
(c) Find a positive real number s such that [s] + [√2] = [0].
(d) Find a positive real number t such that [t] + [π] = [0].


44. For the set Q(√2) = {u + v√2 | u, v ∈ Q}, define ~ as a ~ b whenever a - b ∈ Q. You
may assume that ~ is an equivalence relation. Next, define the addition and
multiplication of equivalence classes as


[a] + [b] = [a + b] and [a] · [b] = [ab],

for all [a], [b].
(a) Show that the addition of equivalence classes is well defined.
(b) Give an example which illustrates that the multiplication of equivalence classes is
not well defined.
(c) Find a positive number s ∈ Q(√2) such that [s] + [√2] = [0].
(d) Find a positive number t ∈ Q(√2) such that [t] + [2 + 3√2] = [0].


45. Write down the first five terms of three different Cauchy sequences of rational numbers
that can be used to represent the number 15.


46. Write down the first five terms of three different Cauchy sequences of rational numbers
that can be used to represent the number -9.


47. Write down the first five terms of three different Cauchy sequences of rational numbers
that can be used to represent the number 13. You may want to refer to the answer to
exercise 32 that appears after Section 4.2.


48. Write down the first five terms of three different Cauchy sequences of rational numbers
that can be used to represent the number 37. You may want to refer to the answer to
exercise 33 that appears after Section 4.2.

In exercises 49-54, you will need to use the following approximations of √5 and √7:


√5 ≈ 2.2360679774997... and √7 ≈ 2.6457513110645....


Your answers from exercise 49 will then be used in exercises 50-54. 


49. Write down the first ten terms of Cauchy sequences of rational numbers that can be used
to represent √5 and √7.


50. Write down the first five terms of a Cauchy sequence of rational numbers that can be
used to represent √5 + √7.


51. Write down the first five terms of a Cauchy sequence of rational numbers that can be
used to represent √5 · √7.


52. Write down the first five terms of a Cauchy sequence of rational numbers that can be
1
used to represent Wik


53. Write down the first five terms of Cauchy sequences of rational numbers that can be used
to represent 2√5 and 3√7.


54. Use your answers from exercise 53 to write down the first five terms of a Cauchy sequence
of rational numbers that can be used to represent 2√5 + 3√7.


In exercises 55-60, you will need to use the following approximations of √6 and √10:


√6 ≈ 2.4494897427831... and √10 ≈ 3.1622776601683....


Your answers from exercise 55 will then be used in exercises 56-60. 


55. Write down the first ten terms of Cauchy sequences of rational numbers that can be used
to represent √6 and √10.


56. Write down the first five terms of a Cauchy sequence of rational numbers that can be
used to represent √6 + √10.


57. Write down the first five terms of a Cauchy sequence of rational numbers that can be
used to represent √6 · √10.


136 Chapter 4 


58. 


59. 


60. 


61. 


62. 
63. 


64. 
65. 
66. 


Write down the first five terms of a Cauchy sequence of rational numbers that can be 
aD 
used to represent Te 


Write down the first five terms of Cauchy sequences of rational numbers that can be used 
to represent 3/6 and —2/10. 


Use your answers from exercise 59 to write down the first five terms of a Cauchy 
sequence of rational numbers that can be used to represent 3/6 — 2/10. 


Let {a,}°°_, be a convergent sequence of real numbers. Prove that {ay }P° , is a Cauchy 


sequence. 
Let {b, }°°., be a Cauchy sequence of real numbers. Prove that {b,}°° , is bounded. 


Prove that ~,;, as defined in Definition 4.15, is an equivalence relation on the set of 
Cauchy sequences of rational numbers. 


Show that the sum of any two Cauchy sequences is also a Cauchy sequence. 
Show that the product of any two Cauchy sequences is also a Cauchy sequence. 


Let E be the collection of equivalence classes from exercise 63 and define the addition
and multiplication of equivalence classes as

$$[\{a_n\}_{n=1}^{\infty}] + [\{b_n\}_{n=1}^{\infty}] = [\{a_n + b_n\}_{n=1}^{\infty}]$$

and

$$[\{a_n\}_{n=1}^{\infty}] \cdot [\{b_n\}_{n=1}^{\infty}] = [\{a_n b_n\}_{n=1}^{\infty}],$$

for all $[\{a_n\}_{n=1}^{\infty}], [\{b_n\}_{n=1}^{\infty}] \in E$. Note that in these definitions we are using the facts from the
previous two exercises that the sum and product of any two Cauchy sequences are also
Cauchy sequences.

(a) Show that the addition of equivalence classes is well defined. 


(b) Show that the multiplication of equivalence classes is well defined. 


The Complex Numbers 


In our examination of the real numbers, we saw that R is large enough to contain a root of 
every polynomial of odd degree. However, R also has the property that the square of each 
element is greater than or equal to zero. Therefore, there are many polynomials of even degree, 
such as $x^2 + 1$ and $x^4 + 2x^2 + 9$, which have no roots in R. In light of this, to find the roots of
all polynomials with real coefficients, we will need a number system larger than R. This leads 
us to a study of the complex numbers C. 


In this chapter, we will define the complex numbers C and examine some of its basic 
properties. This will lead us to examples of the three most important objects of study in 
abstract algebra: fields, commutative rings, and groups. Obtaining experience with examples 
of fields, commutative rings, and groups at this early stage will make it much easier for you to 
develop a deeper understanding of them when you reexamine them in much greater detail in 
Chapters 8, 15, and 17. In the next chapter, we will take a more concrete and geometric view 
of the complex numbers. That chapter will culminate in a proof of the Fundamental Theorem 
of Algebra. Hopefully, the combination of approaches used to examine C in these two chapters 
will make the complex numbers as real and important to you as the real numbers. 


5.1 Complex Numbers 


There are many different ways to introduce the complex numbers. We have chosen the one that 
makes computations the easiest. To get started, let’s look at what happens if we add and 
multiply polynomials with real coefficients of degree at most one. Applying the associative 
and commutative laws, we see 


$$(a+bx)+(c+dx) = (a+c)+(b+d)x$$
and
$$(a+bx)\cdot(c+dx) = ac+(ad+bc)x+bdx^2,$$
for all $a, b, c, d \in R$.
In defining the complex numbers C, we will let i be a symbol that behaves the same way as x
does in the preceding computations, with one big exception. Whenever the term $i^2$ appears,
we replace it by the number $-1$. Therefore, if we were to replace x by i in the calculation, the
term $bdi^2$ would become $-bd$. This leads us to


Definition 5.1. The complex numbers C are all objects of the form $a+bi$, where $a, b \in R$.
Addition and multiplication in C are defined as:

$$(a+bi)+(c+di) = (a+c)+(b+d)i \quad\text{and}\quad (a+bi)\cdot(c+di) = (ac-bd)+(ad+bc)i,$$

for all $a, b, c, d \in R$.


Examples
$$(3+4i)+(-8+7i) = -5+11i,$$
$$(-6+(1-\sqrt{2})i)+(31+\sqrt{2}i) = 25+i,$$
$$(2+i)\cdot(-3+6i) = (2\cdot(-3)-1\cdot 6)+(2\cdot 6+1\cdot(-3))i = -12+9i,$$
$$(4+7i)\cdot(4-7i) = (4\cdot 4-7\cdot(-7))+(4\cdot(-7)+7\cdot 4)i = 65.$$
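The two rules in Definition 5.1 translate directly into code. The following is a minimal sketch, assuming a complex number is stored as a pair of real numbers; the class name `Cx` and its field names are hypothetical and chosen only for illustration. It reproduces three of the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cx:
    """The complex number re + im*i, stored as a pair of real numbers."""
    re: float
    im: float

    def __add__(self, other: "Cx") -> "Cx":
        # (a + bi) + (c + di) = (a + c) + (b + d)i
        return Cx(self.re + other.re, self.im + other.im)

    def __mul__(self, other: "Cx") -> "Cx":
        # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
        a, b, c, d = self.re, self.im, other.re, other.im
        return Cx(a * c - b * d, a * d + b * c)


sum_example = Cx(3, 4) + Cx(-8, 7)        # Cx(re=-5, im=11)
product_example = Cx(2, 1) * Cx(-3, 6)    # Cx(re=-12, im=9)
conjugate_product = Cx(4, 7) * Cx(4, -7)  # Cx(re=65, im=0)
```

Note that nothing in the class mentions $i^2 = -1$ explicitly; that rule is already built into the `ac - bd` term of the product.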


When looking at a complex number $a+bi$, where $a, b \in R$, we often refer to a as the real
part and bi as the complex part. In fact, whenever we write a complex number $\alpha$ in the form
$a+bi$, we will always be assuming, unless explicitly stated otherwise, that $a, b \in R$. As a
shorthand, we will write complex numbers like $2+0i$ as 2, $0-3i$ as $-3i$, and $-5+1i$ as
$-5+i$. In particular, we see that every real number a can be viewed as the complex number
$a+0i$. Observe that in C, the sum and product of $a+0i$ and $c+0i$ are, respectively,
$(a+c)+0i$ and $ac+0i$. Thus, when we view real numbers as living inside of C, they give us
the same sums and products as they did in R.


When doing a computation by hand, it is often easier to simply treat i the way we treated the
variable x and then replace $i^2$ by $-1$ at the end of the problem. For example, we have

$$(5+5i)^2 = (5+5i)\cdot(5+5i) = 25+50i+25i^2 = 25+50i-25 = 50i.$$


Previously we mentioned that the polynomials $x^2+1$ and $x^4+2x^2+9$ have no roots in R.
However, it should now be clear to you that $i$ and $-i$ are both roots of $x^2+1$. Furthermore,
you should now apply the rules for addition and multiplication in C and check, by hand, that
$1+\sqrt{2}i$, $-1+\sqrt{2}i$, $1-\sqrt{2}i$, $-1-\sqrt{2}i$ are all roots of the polynomial $x^4+2x^2+9$.
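A numerical check is no substitute for doing the computation by hand, but it is a quick way to catch arithmetic slips. The sketch below uses Python's built-in `complex` type and floating-point arithmetic, so the residuals are only approximately zero; the function name `p` is a hypothetical label for the polynomial.

```python
import math


def p(x: complex) -> complex:
    """Evaluate x^4 + 2x^2 + 9."""
    return x**4 + 2 * x**2 + 9


r = math.sqrt(2)
candidate_roots = [complex(1, r), complex(-1, r), complex(1, -r), complex(-1, -r)]

# Each residual should be zero up to floating-point rounding error.
residuals = [abs(p(z)) for z in candidate_roots]
print(residuals)

# i and -i are roots of x^2 + 1.
print((1j) ** 2 + 1, (-1j) ** 2 + 1)
```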


At some point in your previous courses, you were probably introduced to the complex 
numbers. Based on your experiences with the complex numbers, some questions you might 
have are: 




(a) The complex numbers are often referred to as the imaginary numbers. Are the complex 
numbers an imaginary object or are they as real and legitimate as other number systems 
like Z, Q, and R? 


(b) Are there ways to view complex numbers as concretely as we view real numbers? 


One of the goals of Chapters 5 and 6 is to convince you that the complex numbers are every bit 
as real and legitimate an object as the real numbers. In fact, we will see that constructing the 
complex numbers from the real numbers is much simpler and more straightforward than 
constructing the real numbers from the rational numbers. 


Over the years, an imprecise and misleading use of language has evolved regarding the 
complex numbers. The set R of real numbers can be used to represent the points on a number 
line. Therefore, the distance between any two points can be represented by an element of R. 
This perspective helps us to view the elements of R as very concrete and very “real.” As a 
result, the set R is called the real numbers. This led to the somewhat natural but very 
inaccurate and misleading impression that the only numbers that can be considered as “real” 
objects are those that belong to R. Therefore, people began to informally think of any number 
that did not belong to R as not being real and to say that it was imaginary. Thus, we began to 
refer to complex numbers as imaginary numbers. 


However, as we will see in Chapter 6, there are also very concrete ways of viewing the 
complex numbers. In fact, in the exercises after Section 5.2, we will see that complex numbers 
can be used to study practical problems such as the opposition to the flow of electricity in 
electrical circuits. Thus, complex numbers can and should be considered as “real” as real 
numbers. But unfortunately, the term imaginary number remains popular. The negative effect 
of this term is that it leads students to believe that complex numbers don’t really exist. As a 
result, many students have the false impression that complex numbers are merely esoteric, 
abstract objects invented by mathematicians. Sadly, students often fail to appreciate that 
complex numbers are as important and as legitimate as integers, rational numbers, and
real numbers.


In mathematics and other walks of life, it is often difficult to become comfortable with new 
ideas. There was a time when people had difficulty accepting the concept of the number
zero. The feeling was that adding zero was the same as adding nothing and nothing could not
exist. Therefore, they concluded that the number zero could not exist. Not surprisingly, it also 
took people a long time to accept that negative numbers were as valid a concept as positive 
numbers. 


Today, we have no trouble accepting the fact that $\sqrt{2}$ is a number that cannot be written as a
quotient of integers. However, there was a time when people felt that $\sqrt{2}$ did not exist, as they
could not accept that there existed numbers that could not be written as a quotient of integers.




In light of this, it is not surprising that many people are initially uncomfortable with the more 
difficult concept that there exist numbers whose square is —1. However, in time, the complex 
numbers have been accepted as a number system that is as important and legitimate as Z, Q, 
and R. 


5.2 Fields and Commutative Rings 


During our examination of Z, Q, and R, we have been tossing around the term number system. 
Actually, there is no formal definition for number system. However, it is reasonable to stop and 
look at some of the important properties that the sets of numbers Z, Q, and R all have in 
common. Then we will check and see how many of these properties are also satisfied by C. 


Properties of Q and R 
1. Associative Law of Addition: For every x, y, z, $(x+y)+z = x+(y+z)$.
2. Additive Identity: There is an element, usually denoted as 0, such that $x+0 = x = 0+x$, for every x.
3. Additive Inverses: For every x there is an element, usually denoted as $-x$, such that $x+(-x) = 0 = (-x)+x$.
4. Commutative Law of Addition: For every x, y, $x+y = y+x$.
5. Associative Law of Multiplication: For every x, y, z, $(xy)z = x(yz)$.
6. Multiplicative Identity: There is an element, usually denoted as 1, such that $x\cdot 1 = x = 1\cdot x$, for every x.
7. Multiplicative Inverses: For every x other than 0, there is an element, usually denoted as $x^{-1}$, such that $x\cdot x^{-1} = 1 = x^{-1}\cdot x$.
8. Commutative Law of Multiplication: For every x, y, $xy = yx$.
9. Distributive Laws: For every x, y, z, $x(y+z) = xy+xz$ and $(x+y)z = xz+yz$.


As we will soon see, C also satisfies properties 1-9. Since C satisfies the same nice algebraic 
properties as Q and R, we can consider C to be an algebraic object as legitimate and worthy of 
study as Q and R. A recurring theme throughout mathematics is that when many different sets 
satisfy a common collection of properties, we give a special name to the sets that satisfy those 
properties. Since many sets in addition to Q, R, and C satisfy properties 1-9, it is natural to 
give a name to those sets that have these properties in common. 


However, before using these examples to motivate the definitions of fields and commutative
rings, observe that if $a, b \in Z$, then $a+b,\ a\cdot b \in Z$. Similarly, if $c, d \in Q$ and $e, f \in R$,
then $c+d,\ c\cdot d \in Q$ and $e+f,\ e\cdot f \in R$. Thus, according to Definition 2.12, addition and
multiplication are both binary operations on the sets Z, Q, and R. Therefore, yet another
important property satisfied by Z, Q, and R is that they are closed under addition and
multiplication. We now have the following.


Definition 5.2. A set F where addition and multiplication are binary operations satisfying
properties 1-9 is called a field.


Note that we did not include Z with Q, R, and C when mentioning some of the sets that satisfy
the nine properties satisfied by fields. To see the reason for this, let us consider various
elements of Z, such as 2, 15, $-3$. The multiplicative inverses of these elements are,
respectively, $\frac{1}{2}$, $\frac{1}{15}$, $-\frac{1}{3}$. However, none of these three multiplicative inverses is in Z. In fact,
the only integers whose multiplicative inverses are also in Z are 1 and $-1$. In light of this, Z
does not satisfy property 7. In particular, for a set S to satisfy property 7, it is not enough for
the nonzero elements of S to have multiplicative inverses in a larger set. Those multiplicative
inverses must belong to S. There are many other sets that, like the integers, satisfy
properties 1-6, 8, and 9 but may not satisfy property 7. We now give a special name to those
sets that satisfy these properties.


Definition 5.3. A set R where addition and multiplication are binary operations satisfying 
properties 1-6, 8, and 9 is called a commutative ring. 


Note that if R is a commutative ring, then R may satisfy, but is not required to satisfy,
property 7. Some commutative rings, like Q and R, satisfy property 7, whereas others, like Z,
do not. In particular, every field is a commutative ring, but not every commutative ring
is a field.


For some additional examples of commutative rings that are not fields, let Z[x], Q[x], and 
R[x] denote, respectively, all polynomials with coefficients in Z, Q, and R. Observe that 
when multiplying nonzero polynomials in Z[x], Q[x], and R[x], degrees never go down. In 
particular, if we multiply the polynomial p(x) = x by other nonzero polynomials in Z[x], 
Q[x], and R[x], the result is always a polynomial of degree at least one. In Z[x], Q[x], and 
R[x], the constant polynomial g(x) = 1 is the identity element of multiplication. However, in 
Z[x], Q[x], and R[x], it is impossible to multiply x by another polynomial to obtain the
polynomial 1. Thus, p(x) = x does not have a multiplicative inverse in Z[x], Q[x], or R[x]. 
In fact, this argument really tells us that any polynomial of degree at least one does not have 
a multiplicative inverse in Z[x], Q[x], or R[x]. As a result, Z[x], Q[x], and R[x] are 
commutative rings that are not fields. 


The previous examples can be generalized to allow us to produce even more examples of 
commutative rings that are not fields. If R is a commutative ring, then the set R[x] of 
polynomials with coefficients in R is also a commutative ring. This tells us that every 
time we adjoin a new variable to a commutative ring, we obtain a new commutative ring. 




In multivariable calculus, one studies the set R[x, y] of all polynomials in two variables with 
real coefficients. By our previous observation, R[x, y] is also a commutative ring. More 
generally, if R is any commutative ring, the set $R[x_1, x_2, x_3, \ldots, x_n]$ of polynomials in n
variables, where n can be any natural number, is also a commutative ring. In fact, if we allow
the number of variables to be infinite and consider the set $R[x_1, x_2, x_3, \ldots]$, we again obtain a
commutative ring. However, in all of these examples, there are many polynomials that do not 
have multiplicative inverses, so all of these examples fail to be fields. 


We can now turn our attention back to C and begin the work necessary to show that it is 
a field. 


Lemma 5.4. C is a commutative ring. 


Intuition. As one becomes more adept at doing proofs, one needs to develop the skill of 
understanding not only what needs to be proved, but also what may be assumed in the proof. 
As we prove that C satisfies properties 1-6, 8, and 9, we will be assuming that R already 
satisfies these properties. Although the computations needed to show that C inherits these 
properties from R may occasionally become tedious, the thinking involved is always very 
straightforward. This is because the construction of C from R is much more straightforward 
and concrete than either the construction of Q from Z or the construction of R from Q. Recall 
that every element of Q is not merely a pair of integers but an equivalence class of pairs of 
integers. Even more involved is the construction of R from Q as equivalence classes of 
Cauchy sequences of elements of Q. Since both Q and R are defined in terms of equivalence 
classes, it took some work to justify that addition and multiplication in Q and R were well 
defined. On the other hand, every element of C corresponds to one and only one ordered pair 
of elements of R. We do not need to use equivalence classes, and we know immediately that 
addition and multiplication are well defined. Thus, moving from an understanding of R to an 
understanding of C is, in many ways, much easier than making the analogous move from Z to 
Q or from Q to R.


It will be fairly quick and easy to show that C inherits properties 1-4, 6, and 8. Showing that 
C inherits properties 5 and 9, although straightforward, will require a reasonable amount of 
calculating and bookkeeping. Before doing the actual proof, we will work through some 
examples to gain experience with the type of calculations needed to prove facts about the 
complex numbers. As you read along, you should do these calculations with paper and pencil. 
It is important to note that, in these examples, we will be illustrating that properties 1-6, 8, 
and 9 hold in these special cases but that these examples do not constitute a proof that they 
hold in all cases. 


Let us consider the special case where

$$\alpha = 3+2i, \quad \beta = 5-i, \quad \gamma = 7+8i.$$

In this case, $\alpha+\beta = 8+i$, so $(\alpha+\beta)+\gamma = 15+9i$. Furthermore, $\beta+\gamma = 12+7i$, so
$\alpha+(\beta+\gamma) = 15+9i$. Thus, $(\alpha+\beta)+\gamma = \alpha+(\beta+\gamma)$. Also, $\beta+\alpha = 8+i$, so $\alpha+\beta = \beta+\alpha$.
These examples illustrate (but do not prove) the associativity and commutativity of addition
in C. Furthermore, $0+0i$ has the property that

$$\alpha+(0+0i) = (3+2i)+(0+0i) = (3+0)+(2+0)i = 3+2i = \alpha$$
and
$$(0+0i)+\alpha = (0+0i)+(3+2i) = (0+3)+(0+2)i = 3+2i = \alpha.$$

This illustrates that $0+0i$ has the property that belongs to the additive identity of C. If we
consider the complex number $-3-2i$, it has the property that

$$\alpha+(-3-2i) = (3+2i)+(-3-2i) = 0+0i = (-3-2i)+(3+2i) = (-3-2i)+\alpha.$$
Thus $-3-2i$ is the additive inverse of $\alpha$.

For the properties concerning multiplication, we have $\alpha\cdot\beta = 17+7i$, thus $(\alpha\cdot\beta)\cdot\gamma = 63+185i$.
Furthermore, $\beta\cdot\gamma = 43+33i$, hence $\alpha\cdot(\beta\cdot\gamma) = 63+185i$. Thus $(\alpha\cdot\beta)\cdot\gamma = \alpha\cdot(\beta\cdot\gamma)$.
Also observe that $\beta\cdot\alpha = 17+7i$, hence $\alpha\cdot\beta = \beta\cdot\alpha$. These examples illustrate (but
do not prove) the associativity and commutativity of multiplication in C. Note that the
complex number $1+0i$ has the property that

$$\alpha\cdot(1+0i) = (3+2i)\cdot(1+0i) = 3+2i = \alpha = 3+2i = (1+0i)\cdot(3+2i) = (1+0i)\cdot\alpha.$$
This illustrates that $1+0i$ has the property belonging to the multiplicative identity of C.
To illustrate one distributive law, we have

$$\alpha\cdot(\beta+\gamma) = (3+2i)\cdot((5-i)+(7+8i)) = (3+2i)\cdot(12+7i) = 22+45i$$
and
$$\alpha\cdot\beta+\alpha\cdot\gamma = (3+2i)\cdot(5-i)+(3+2i)\cdot(7+8i) = (17+7i)+(5+38i) = 22+45i.$$
Therefore, $\alpha\cdot(\beta+\gamma) = \alpha\cdot\beta+\alpha\cdot\gamma$.
To illustrate the other distributive law, we have

$$(\alpha+\beta)\cdot\gamma = ((3+2i)+(5-i))\cdot(7+8i) = (8+i)\cdot(7+8i) = 48+71i$$
and
$$\alpha\cdot\gamma+\beta\cdot\gamma = (3+2i)\cdot(7+8i)+(5-i)\cdot(7+8i) = (5+38i)+(43+33i) = 48+71i.$$

Therefore $(\alpha+\beta)\cdot\gamma = \alpha\cdot\gamma+\beta\cdot\gamma$.
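The special-case computations above can be replayed with Python's built-in `complex` type. This is only a spot check for these three numbers, in the same spirit as the examples: it illustrates the laws and does not prove them. The dictionary name `checks` is a hypothetical label.

```python
# The three complex numbers used in the worked examples.
alpha, beta, gamma = complex(3, 2), complex(5, -1), complex(7, 8)

# All values involved are small integers, so floating-point arithmetic is exact here.
checks = {
    "addition is associative": (alpha + beta) + gamma == alpha + (beta + gamma),
    "addition is commutative": alpha + beta == beta + alpha,
    "multiplication is associative": (alpha * beta) * gamma == alpha * (beta * gamma),
    "multiplication is commutative": alpha * beta == beta * alpha,
    "left distributive law": alpha * (beta + gamma) == alpha * beta + alpha * gamma,
    "right distributive law": (alpha + beta) * gamma == alpha * gamma + beta * gamma,
}

for law, holds in checks.items():
    print(f"{law}: {holds}")
```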




When reading the proof of Lemma 5.4, note how frequently we use the fact that R is a 
commutative ring. Although the proof is somewhat long and tedious, all the steps are 
straightforward. 
Proof of Lemma 5.4. Let $\alpha, \beta, \gamma \in C$; then we can write
$$\alpha = a+bi, \quad \beta = c+di, \quad \gamma = e+fi,$$
where $a, b, c, d, e, f \in R$. To check that addition is associative, we have
$$(\alpha+\beta)+\gamma = ((a+bi)+(c+di))+(e+fi) = ((a+c)+(b+d)i)+(e+fi) = ((a+c)+e)+((b+d)+f)i \qquad (1)$$
and
$$\alpha+(\beta+\gamma) = (a+bi)+((c+di)+(e+fi)) = (a+bi)+((c+e)+(d+f)i) = (a+(c+e))+(b+(d+f))i. \qquad (2)$$
However, since addition in R is associative,
$$(a+c)+e = a+(c+e) \quad\text{and}\quad (b+d)+f = b+(d+f).$$
Referring back to (1) and (2), we see that
$$(\alpha+\beta)+\gamma = \alpha+(\beta+\gamma).$$
As a result, the associativity of addition in R tells us that addition in C is also associative.
To show that C has an additive identity, consider the complex number $0+0i$. We have
$$\alpha+(0+0i) = (a+bi)+(0+0i) = (a+0)+(b+0)i = a+bi = \alpha$$
and
$$(0+0i)+\alpha = (0+0i)+(a+bi) = (0+a)+(0+b)i = a+bi = \alpha.$$


Thus, 0+ 0i is the additive identity of C. Technically speaking, 0+ 0i is the appropriate way of 
writing the additive identity of C. However, if no confusion arises, we use the shorthand 0 for 
the additive identity element in a commutative ring. 


Next, given $\alpha = a+bi$, consider the complex number $-a+(-b)i$. We now have
$$\alpha+(-a+(-b)i) = (a+bi)+(-a+(-b)i) = (a-a)+(b-b)i = 0+0i$$
and
$$(-a+(-b)i)+\alpha = (-a+(-b)i)+(a+bi) = (-a+a)+(-b+b)i = 0+0i.$$

Thus, $-a+(-b)i$ is the additive inverse of $\alpha = a+bi$. We usually use the shorthand $-a-bi$
for $-a+(-b)i$ and also let $-\alpha$ denote the additive inverse of $\alpha$.




To show that addition in C is commutative, observe that
$$\alpha+\beta = (a+bi)+(c+di) = (a+c)+(b+d)i$$
and
$$\beta+\alpha = (c+di)+(a+bi) = (c+a)+(d+b)i.$$

Since addition in R is commutative, $a+c = c+a$ and $b+d = d+b$, which tells us that
$\alpha+\beta = \beta+\alpha$. Thus, the commutativity of addition in R tells us that addition in C is also
commutative.


Now, consider the complex number $1+0i$. We now have
$$\alpha\cdot(1+0i) = (a+bi)\cdot(1+0i) = (a\cdot 1-b\cdot 0)+(a\cdot 0+b\cdot 1)i = a+bi = \alpha$$
and
$$(1+0i)\cdot\alpha = (1+0i)\cdot(a+bi) = (1\cdot a-0\cdot b)+(1\cdot b+0\cdot a)i = a+bi = \alpha.$$

Therefore, $1+0i$ is the multiplicative identity. Once again, if no confusion arises, we will
usually denote the multiplicative identity of a commutative ring as 1.


At this point, we have shown that C satisfies properties 1-4 and 6. To show that multiplication
in C is associative, we observe that

$$(\alpha\cdot\beta)\cdot\gamma = ((a+bi)\cdot(c+di))\cdot(e+fi) = ((ac-bd)+(ad+bc)i)\cdot(e+fi) = ((ac-bd)e-(ad+bc)f)+((ac-bd)f+(ad+bc)e)i \qquad (3)$$
and
$$\alpha\cdot(\beta\cdot\gamma) = (a+bi)\cdot((c+di)\cdot(e+fi)) = (a+bi)\cdot((ce-df)+(cf+de)i) = (a(ce-df)-b(cf+de))+(a(cf+de)+b(ce-df))i. \qquad (4)$$

However, since addition and multiplication in R are both associative, addition in R is
commutative, and the distributive laws hold in R, we see that both $(ac-bd)e-(ad+bc)f$ and
$a(ce-df)-b(cf+de)$ are equal to $ace-bde-adf-bcf$. Similarly, we also see that both
$(ac-bd)f+(ad+bc)e$ and $a(cf+de)+b(ce-df)$ are equal to $acf-bdf+ade+bce$. Plugging
these facts into (3) and (4) shows us that

$$(\alpha\cdot\beta)\cdot\gamma = \alpha\cdot(\beta\cdot\gamma).$$
Thus, several properties of R combine to show that multiplication in C is associative.
To see that multiplication in C is commutative, observe that
$$\alpha\cdot\beta = (a+bi)\cdot(c+di) = (ac-bd)+(ad+bc)i$$
and
$$\beta\cdot\alpha = (c+di)\cdot(a+bi) = (ca-db)+(cb+da)i.$$

Since both addition and multiplication are commutative in R, we see that $ac-bd = ca-db$
and $ad+bc = cb+da$. Plugging these facts into the preceding equations immediately
shows us that $\alpha\cdot\beta = \beta\cdot\alpha$. Once again, C inherits properties from R, and we see that
multiplication in C is commutative.


Finally, we need to show that C satisfies the distributive laws. For one distributive law,
observe that

$$\alpha\cdot(\beta+\gamma) = (a+bi)\cdot((c+di)+(e+fi)) = (a+bi)\cdot((c+e)+(d+f)i) = (a(c+e)-b(d+f))+(a(d+f)+b(c+e))i$$
and
$$\alpha\cdot\beta+\alpha\cdot\gamma = (a+bi)\cdot(c+di)+(a+bi)\cdot(e+fi) = ((ac-bd)+(ad+bc)i)+((ae-bf)+(af+be)i) = (ac-bd+ae-bf)+(ad+bc+af+be)i.$$

Using several properties of R, including the distributive laws, it follows that
$a(c+e)-b(d+f) = ac-bd+ae-bf$ and $a(d+f)+b(c+e) = ad+bc+af+be$. Plugging
these facts into the preceding equations shows us that $\alpha\cdot(\beta+\gamma) = \alpha\cdot\beta+\alpha\cdot\gamma$. Thus, C
satisfies one of the two distributive laws.


To show that C satisfies the second distributive law, we can save a good deal of work if we
apply the first distributive law and the commutativity of multiplication. Applying these laws
results in

$$(\alpha+\beta)\cdot\gamma = \gamma\cdot(\alpha+\beta) = \gamma\cdot\alpha+\gamma\cdot\beta = \alpha\cdot\gamma+\beta\cdot\gamma,$$

as desired.

In the preceding proof, we saw that by using the commutativity of multiplication, each 
distributive law easily follows from the other. Therefore, as long as multiplication is 
commutative, it is indeed redundant to list both distributive laws. However, it is very 
common in mathematics to study sets with addition and multiplication that satisfy all the 
properties of a commutative ring, except that multiplication may not be commutative. Such 
objects are known either as rings or noncommutative rings. 


If you have taken a class in linear algebra, then you are familiar with matrix multiplication. If
you go back and look at the properties satisfied by the addition and multiplication of the set of
2 x 2 matrices, you will see that they are an example of a noncommutative ring. In such sets,
where multiplication is not commutative, having one distributive law hold does not
automatically imply that the other distributive law holds. For this reason, we usually mention
both distributive laws even though it is only necessary to mention one of them for
commutative rings.
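To see concretely that multiplication of 2 x 2 matrices is not commutative, it is enough to exhibit one pair of matrices whose products differ. The sketch below is one such pair; the helper name `matmul2` and the particular matrices are illustrative choices, not taken from the text.

```python
def matmul2(A, B):
    """Multiply two 2 x 2 matrices given as nested lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]

AB = matmul2(A, B)  # [[2, 1], [1, 1]]
BA = matmul2(B, A)  # [[1, 1], [1, 2]]
print(AB, BA, AB == BA)
```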


In light of Lemma 5.4, all that remains to complete the proof that C is a field is to show that 
every nonzero element of C has a multiplicative inverse. As in the proof of Lemma 5.4, we 
will make strong use of properties of R. 


Theorem 5.5. C is a field. 


Proof. Having proven in Lemma 5.4 that C is a commutative ring, it only remains to show that
every nonzero element of C has a multiplicative inverse in C. Since $0+0i$ is the additive
identity element of C, if $\alpha = a+bi$ is a nonzero element of C, then $\alpha \neq 0+0i$, so at least one
of a or b is nonzero. Thus, $a^2+b^2 \neq 0$.

Since $a^2+b^2$ has a multiplicative inverse in R, we can consider the complex number
$$\beta = \frac{a}{a^2+b^2}-\frac{b}{a^2+b^2}i.$$
We now have
$$\alpha\cdot\beta = (a+bi)\cdot\left(\frac{a}{a^2+b^2}-\frac{b}{a^2+b^2}i\right) = \left(\frac{a^2}{a^2+b^2}+\frac{b^2}{a^2+b^2}\right)+\left(\frac{-ab}{a^2+b^2}+\frac{ab}{a^2+b^2}\right)i = \frac{a^2+b^2}{a^2+b^2}+\frac{0}{a^2+b^2}i = 1.$$
Since multiplication in C is commutative, we also have $\beta\cdot\alpha = 1$. Thus, $\beta$ is the multiplicative
inverse of $\alpha$, so C is a field.
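The formula for $\beta$ in the proof can be tried out with exact rational arithmetic. The sketch below assumes a complex number is represented as a pair (a, b) standing for $a+bi$; the function names `inverse` and `multiply` are hypothetical.

```python
from fractions import Fraction


def inverse(a, b):
    """Return (c, d) with (a + bi)(c + di) = 1, using c = a/(a^2+b^2), d = -b/(a^2+b^2)."""
    norm = Fraction(a) ** 2 + Fraction(b) ** 2
    if norm == 0:
        # Only 0 + 0i has a^2 + b^2 = 0, and it has no multiplicative inverse.
        raise ZeroDivisionError("0 + 0i has no multiplicative inverse")
    return Fraction(a) / norm, -Fraction(b) / norm


def multiply(x, y):
    """Multiply two complex numbers given as (real, complex-part coefficient) pairs."""
    (a, b), (c, d) = x, y
    return a * c - b * d, a * d + b * c


inv = inverse(3, -4)          # (Fraction(3, 25), Fraction(4, 25))
one = multiply((3, -4), inv)  # (Fraction(1, 1), Fraction(0, 1))
print(inv, one)
```

Using `Fraction` rather than floating-point numbers keeps the check exact, so the product comes out as precisely $1+0i$.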


Examples
$$(3-4i)^{-1} = \frac{3}{3^2+4^2}+\frac{4}{3^2+4^2}i = \frac{3}{25}+\frac{4}{25}i,$$
$$\left(\frac{12}{193}+\frac{7}{193}i\right)^{-1} = 12-7i,$$
$$(6i)^{-1} = \frac{0}{0^2+6^2}-\frac{6}{0^2+6^2}i = -\frac{1}{6}i.$$




Exercises for Sections 5.1 and 5.2 


In these exercises, whenever your answer is a complex number, try to express it in the form
$a+bi$, where $a, b \in R$.


In exercises 1-20, let $\alpha = 2+3i$, $\beta = 5-i$, $\gamma = -3+8i$, and $\delta = \sqrt{2}+\pi i$.

1. Compute $\alpha+\beta$.
2. Compute $6\alpha-4\beta$.
3. Compute $\sqrt{3}\gamma-11\delta$.
4. Compute $-12\gamma+\sqrt{7}\delta$.
5. Compute $-2\alpha+5\beta+7\gamma-4\delta$.
6. Compute $3\alpha-4\beta-9\gamma+12\delta$.
7. Compute $\alpha\cdot\beta$.
8. Use your answer from exercise 7 to compute $(\alpha\cdot\beta)^{-1}$.
9. Compute $\alpha^{-1}$ and $\beta^{-1}$.
10. Use your answer from exercise 9 to compute $\alpha^{-1}\cdot\beta^{-1}$. Compare your answer to your
answer from exercise 8.
11. Compute $\gamma\cdot\delta$.
12. Use your answer from exercise 11 to compute $(\gamma\cdot\delta)^{-1}$.
13. Compute $\gamma^{-1}$ and $\delta^{-1}$.
14. Use your answer from exercise 13 to compute $\gamma^{-1}\cdot\delta^{-1}$. Compare your answer to your
answer from exercise 12.
15. Compute $\alpha^{-1}+\beta^{-1}$ and compute $(\alpha+\beta)^{-1}$ and compare your answers. Can you draw
any conclusions?
16. Compute $\gamma^{-1}+\delta^{-1}$ and compute $(\gamma+\delta)^{-1}$ and compare your answers. Can you draw
any conclusions?
17. Compute $(\alpha\cdot\beta)\cdot\gamma$ and $\alpha\cdot(\beta\cdot\gamma)$ and compare your answers.
18. Compute $(\beta\cdot\gamma)\cdot\delta$ and $\beta\cdot(\gamma\cdot\delta)$ and compare your answers.
19. Compute $\alpha\cdot(\beta+\gamma)$ and $\alpha\cdot\beta+\alpha\cdot\gamma$ and compare your answers.
20. Compute $(\alpha+\beta)\cdot\gamma$ and $\alpha\cdot\gamma+\beta\cdot\gamma$ and compare your answers.


21. Solve for x: $(3-4i)x = 7+2i$.
22. Solve for y: $(2+5i)y = 3-15i$.
23. Solve for t: $(6-i)t+(4-5i) = (-8+3i)t+(-2+i)$.
24. Solve for v: $(5-2i)v+(7+11i) = (-6-8i)v+(23-6i)$.
25. Solve for x: $x^2-6x+15 = 0$.
26. Solve for y: $y^2+12y+50 = 0$.
27. Find the roots of the polynomial $x^2-5ix-4$ by completing the square.
28. Find the roots of the polynomial $x^2+7ix+18$ by completing the square.
29. Solve for z: $z^2+(2+2i)z+2i = 0$. You might want to use the fact that $(1+i)^2 = 2i$.
30. Solve for w: $w^2+(6-4i)w+(13-12i) = 0$. You might want to use the fact that
$(3-2i)^2 = 5-12i$.
31. Solve for the ordered pair (x, y) in the following system of linear equations:
$(3+i)x+4y = 7+8i$, $5x-(2+i)y = 6$.
32. Solve for the ordered pair (A, B) in the following system of linear equations:
$5A+(3-4i)B = 17$, $(9-4i)A+(8)B = 1-2i$.
33. Check that $1+\sqrt{2}i$, $-1+\sqrt{2}i$, $1-\sqrt{2}i$, $-1-\sqrt{2}i$ are all roots of the polynomial
$x^4+2x^2+9$ by plugging all four of them into the polynomial.
34. Use the substitution $t = x^2$ and then the quadratic formula to find the roots of the
polynomial $x^4+2x^2+9$. At first, your four solutions may not be in the form $a+bi$,
where $a, b \in R$. However, check that your four roots are indeed the same four roots as
in exercise 33.
35. Check that $\sqrt{3}+i$, $\sqrt{3}-i$, $-\sqrt{3}+i$, $-\sqrt{3}-i$ are all roots of the polynomial
$x^4-4x^2+16$ by plugging all four of them into the polynomial.
36. Use the substitution $t = x^2$ and then the quadratic formula to find the roots of the
polynomial $x^4-4x^2+16$. At first, your four solutions may not be in the form $a+bi$,
where $a, b \in R$. However, check that your four roots are indeed the same four roots as
in exercise 35.
37. Check that $4-8i$, $4+8i$, $2+\sqrt{3}i$, $2-\sqrt{3}i$ are all roots of the polynomial
$x^4-12x^3+119x^2-376x+560$ by plugging all four of them into the polynomial.
38. Check that $3+5i$, $3-5i$, $7-i$, $7+i$ are all roots of the polynomial
$x^4-20x^3+168x^2-776x+1700$ by plugging all four of them into the polynomial.


39. If x, y are nonzero elements of a field F, show that $x^{-1}y^{-1}$ is the multiplicative inverse
of xy.
40. Show that the nonzero elements of a field are closed under multiplication.
41. If R is a commutative ring, let U(R) denote the elements in R which have a
multiplicative inverse in R. Use the ideas in exercise 39 to show that U(R) is closed
under multiplication.
42. If Q[x] denotes the commutative ring consisting of all polynomials with rational
coefficients, describe those elements in Q[x] that have a multiplicative inverse
in Q[x].
43. If Z[x] denotes the commutative ring consisting of all polynomials with integer
coefficients, list all the elements in Z[x] that have a multiplicative inverse
in Z[x].
44. In Chapter 4, using equivalence relations and equivalence classes, we saw how the
rational numbers can be constructed from the integers. Using similar ideas, we will now
construct the set R(x) of rational functions with real coefficients from the set R[x] of
polynomials with real coefficients.

Let $S = \{(a(x), b(x)) \mid a(x), b(x) \in R[x],\ b(x) \neq 0\}$. Define $\sim$ on S as
$(a(x), b(x)) \sim (c(x), d(x))$ precisely when $a(x)d(x) = b(x)c(x)$.

(a) Show that $\sim$ is an equivalence relation.

(b) If we define the addition of equivalence classes as
$$[(a(x), b(x))]+[(c(x), d(x))] = [(a(x)d(x)+b(x)c(x),\ b(x)d(x))],$$
show that addition is well defined.

(c) If we define the multiplication of equivalence classes as
$$[(a(x), b(x))]\cdot[(c(x), d(x))] = [(a(x)c(x),\ b(x)d(x))],$$
show that multiplication is well defined.

(d) Show that every equivalence class, other than the equivalence class which contains
(0, 1), has a multiplicative inverse.

45. Let $R[[x]] = \{a_0+a_1x+a_2x^2+\cdots+a_nx^n+\cdots \mid \text{each } a_i \in R\}$ denote the set of
power series with real coefficients. In calculus, you are concerned with where a power
series converges. However, even if one is not concerned with questions of convergence,
R[[x]] is a commutative ring. If

$$\left(1+x+\frac{x^2}{2}+\frac{x^3}{6}+\frac{x^4}{24}+\frac{x^5}{120}+\cdots\right)^{-1} = a_0+a_1x+a_2x^2+a_3x^3+a_4x^4+\text{higher-degree terms},$$

where each $a_i \in R$, compute $a_0, a_1, a_2, a_3, a_4$. Do you think R[[x]] is a field?


46. Let $R[[x, x^{-1}]] = \{a_{-m}x^{-m}+a_{-m+1}x^{-m+1}+\cdots+a_{-1}x^{-1}+a_0+a_1x+\cdots+a_nx^n+\cdots \mid \text{each } a_i \in R\}$
denote the set of Laurent series with real coefficients. Laurent
series are identical to power series except that a Laurent series is allowed to have a finite
number of terms where the exponents are negative integers. The set $R[[x, x^{-1}]]$ is also a
commutative ring. If

$$\left(\frac{x^2}{2}+\frac{x^3}{6}+\frac{x^4}{24}+\frac{x^5}{120}+\cdots\right)^{-1} = b_{-2}x^{-2}+b_{-1}x^{-1}+b_0+b_1x+b_2x^2+b_3x^3+\text{higher-degree terms},$$

where each $b_i \in R$, compute $b_{-2}, b_{-1}, b_0, b_1, b_2, b_3$. Do you think $R[[x, x^{-1}]]$ is a field?


For exercises 47-54, please read the following: 


As we remarked earlier, the addition of complex numbers is straightforward and is virtually 
identical to the addition of polynomials with real coefficients of degree at most 1. However, 
when we multiply complex numbers, we replace i? by —1 and this causes some people to 
question the validity and relevance of complex numbers. Since we can square an element of 
the complex numbers and obtain —1 as an answer, many students initially view the 
multiplication of complex numbers as an arbitrary piece of algebraic trickery that lacks any 
connection to the real world. However, the multiplication of complex numbers is not only a 
valid algebraic procedure but also has real-world applications, as it can be used to describe the 
opposition to the flow of electricity in electrical circuits. As electricity flows along an
electrical circuit, various objects, such as lightbulbs and toasters, can oppose the flow of the
electrical current. The total opposition in an electrical circuit is called the impedance and is
measured in ohms (which are denoted by the symbol Ω). The impedance provided by various
objects along an electrical circuit can be represented by complex numbers. Two common types 
of electrical circuits are series circuits and parallel circuits. 


If we let the zigzag symbol -/\/\/\- denote an object along an electrical circuit that opposes
the flow of the current, then two examples of series circuits are

[Figure: two series circuits, each connected to a POWER source.]


Observe that in a series circuit, all the objects that oppose the flow of electricity carry the same 
current. On the other hand, in a parallel circuit, the objects that oppose the flow of electricity 
have the same power source but no longer all carry the same current. Two examples of parallel
circuits are

[Figure: two parallel circuits, each connected to a POWER source.]


In addition, an electrical circuit can consist of a combination of series and parallel connections
such as

[Figure: two circuits that combine series and parallel connections, each connected to a POWER source.]


Computing the total impedance of a series circuit is straightforward and does not require the
multiplication of complex numbers. In particular, if $Z_1$ and $Z_2$ represent the impedance of two
objects in the series circuit

[Figure: series circuit containing $Z_1$ and $Z_2$ and a POWER source.]

then the total impedance of the circuit, denoted as $Z_T$, is given by $Z_T = Z_1 + Z_2$. However,
finding the total impedance of a parallel circuit does require the multiplication of complex
numbers. Given the parallel circuit


[Figure: parallel circuit with $Z_1$ and $Z_2$ on separate branches and a POWER source.]

the total impedance of the circuit is given by

$$Z_T = \frac{Z_1 \cdot Z_2}{Z_1 + Z_2}.$$
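The two impedance rules above translate directly into arithmetic on complex numbers. The following is a minimal sketch using Python's built-in complex type; the function names `series` and `parallel` and the sample values are illustrative assumptions, not part of the text.

```python
def series(*impedances: complex) -> complex:
    """Total impedance of objects in series: Z_T = Z_1 + Z_2 + ..."""
    return sum(impedances, 0j)


def parallel(z1: complex, z2: complex) -> complex:
    """Total impedance of two objects in parallel: Z_T = (Z_1 * Z_2) / (Z_1 + Z_2)."""
    if z1 + z2 == 0:
        raise ZeroDivisionError("Z_1 + Z_2 must be nonzero")
    return (z1 * z2) / (z1 + z2)


# Illustrative impedances in ohms (hypothetical values, not taken from the exercises).
za = 1 + 1j
zb = 1 - 1j

print(series(za, zb))    # (2+0j)
print(parallel(za, zb))  # (1+0j)
```

Note that only the parallel rule needs a product and a quotient of complex numbers; the series rule is plain addition.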


In exercises 47-52, let $Z_1 = 3 + 2i\ \Omega$, $Z_2 = 5 - 3i\ \Omega$, and $Z_3 = 2 + 7i\ \Omega$.

47-52. For each of the six circuits built from $Z_1$, $Z_2$, and $Z_3$, find $Z_T$.

[Figures: six circuit diagrams, one per exercise, each containing a POWER source.]


For exercises 53-54, if $\alpha, \beta \in \mathbb{C}$, let $\circ$ denote the operation $\alpha \circ \beta = \dfrac{\alpha \cdot \beta}{\alpha + \beta}$.


53. In the circuit

[Figure: circuit with $Z_1$ and $Z_2$ in parallel and a POWER source.]

we could find $Z_T$ by computing either $Z_1 \circ Z_2$ or $Z_2 \circ Z_1$. Observe that we would obtain
different answers if $\circ$ were not commutative. In light of this, prove algebraically that the
operation $\circ$ is commutative. (You may assume that $Z_1 + Z_2 \neq 0$.)


54. In the circuit

[Figure: circuit with $Z_1$, $Z_2$, and $Z_3$ in parallel and a POWER source.]

we could find $Z_T$ by computing either $(Z_1 \circ Z_2) \circ Z_3$ or $Z_1 \circ (Z_2 \circ Z_3)$. Observe that
we would obtain different answers if $\circ$ were not associative. In light of this, prove
algebraically that the operation $\circ$ is associative. (You may assume that $Z_1 + Z_2$, $Z_2 + Z_3$, and
$Z_1 \cdot Z_2 + Z_1 \cdot Z_3 + Z_2 \cdot Z_3$ are all nonzero.)


5.3 Complex Conjugation 

Coming up with the element $\beta$ in the proof of Theorem 5.5 may have seemed like pulling a
rabbit out of a hat. The complex number $\beta = \frac{a}{a^2+b^2} - \frac{b}{a^2+b^2}\,i$ appeared to come out of nowhere.
However, $\beta = \frac{a}{a^2+b^2} - \frac{b}{a^2+b^2}\,i$ will seem much more natural once we become familiar with
complex conjugation. Not only is complex conjugation extremely useful for studying $\mathbb{C}$, but
generalizations of it will have applications throughout abstract algebra.

Definition 5.6. Let $* : \mathbb{C} \to \mathbb{C}$ be the function defined as $(a + bi)^* = a - bi$, for all $a, b \in \mathbb{R}$.
We call $*$ complex conjugation and if $\alpha \in \mathbb{C}$, we call $\alpha^*$ the complex conjugate of $\alpha$.


Examples

$(2 + 3i)^* = 2 - 3i$, $(2 - 3i)^* = 2 + 3i$, $(\sqrt{7} + 2i)^* = \sqrt{7} - 2i$, $(7)^* = 7$, $(8i)^* = -8i$,
$(\sqrt{11})^* = \sqrt{11}$, $(\sqrt{11}\,i)^* = -\sqrt{11}\,i$.

Simply stated, $*$ fixes the real part of $\alpha$ and negates the imaginary part of $\alpha$. We can now record
some of the many properties of $*$.


Lemma 5.7. Let $* : \mathbb{C} \to \mathbb{C}$ denote complex conjugation. Then
(a) $*$ is a bijection.

(b) If $\alpha, \beta \in \mathbb{C}$, then $(\alpha + \beta)^* = \alpha^* + \beta^*$.

(c) If $\alpha, \beta \in \mathbb{C}$, then $(\alpha \cdot \beta)^* = \alpha^* \cdot \beta^*$.


Proof. In order to show that $*$ is a bijection, we will first show that it is surjective and will then
show that it is injective. Let $\alpha = a + bi \in \mathbb{C}$. To show that $*$ is surjective, we need to find some
element of $\mathbb{C}$ that, when plugged into $*$, gives $\alpha$ as the answer. However,

$$(a - bi)^* = a - (-bi) = a + bi = \alpha,$$

thus, $a - bi$ is the desired element.
To show that $*$ is injective, we must show that if two elements of $\mathbb{C}$ give the same answer
when plugged into $*$, then those elements must have been equal. To this end, let $\alpha = a + bi$,
$\beta = c + di \in \mathbb{C}$ such that $\alpha^* = \beta^*$. Since $\alpha^* = (a + bi)^* = a - bi$ and $\beta^* = (c + di)^* = c - di$,
the fact that $\alpha^* = \beta^*$ tells us that $a = c$ and $-b = -d$. However, this certainly implies that
$b = d$, which immediately tells us that $\alpha = \beta$, as desired. Thus, $*$ is a bijection.


To see that $*$ satisfies property (b), if $\alpha = a + bi$ and $\beta = c + di$, we have

$$(\alpha + \beta)^* = ((a + bi) + (c + di))^* = ((a + c) + (b + d)i)^* = (a + c) - (b + d)i = (a - bi) + (c - di) = (a + bi)^* + (c + di)^* = \alpha^* + \beta^*.$$

The computation that shows that $*$ satisfies property (c) is only slightly more involved. On the
one hand, we have

$$(\alpha \cdot \beta)^* = ((a + bi) \cdot (c + di))^* = ((ac - bd) + (ad + bc)i)^* = (ac - bd) - (ad + bc)i.$$

On the other hand,

$$\alpha^* \cdot \beta^* = (a + bi)^* \cdot (c + di)^* = (a - bi) \cdot (c - di) = (ac - (-b)(-d)) + (a(-d) + (-b)c)i = (ac - bd) - (ad + bc)i.$$

Comparing the two sets of equations above, we see that $(\alpha \cdot \beta)^* = \alpha^* \cdot \beta^*$.
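The properties in Lemma 5.7 can be spot-checked by machine. The sketch below is a sanity check on a small grid of complex numbers with integer parts, not a substitute for the proof; the helper name `conj` and the choice of sample grid are assumptions made for illustration.

```python
from itertools import product


def conj(z: complex) -> complex:
    """Complex conjugation: (a + bi)* = a - bi."""
    return complex(z.real, -z.imag)


# Complex numbers a + bi with small integer a, b; floating-point arithmetic is exact here.
samples = [complex(a, b) for a, b in product(range(-3, 4), repeat=2)]

# Property (b): conjugation respects addition.
additive = all(conj(x + y) == conj(x) + conj(y) for x in samples for y in samples)

# Property (c): conjugation respects multiplication.
multiplicative = all(conj(x * y) == conj(x) * conj(y) for x in samples for y in samples)

# Applying * twice returns the original element, one way to see that * is a bijection.
undoes_itself = all(conj(conj(x)) == x for x in samples)

print(additive, multiplicative, undoes_itself)
```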




Although * satisfies many other important properties, we have singled out properties (a), (b), 
and (c) in Lemma 5.7. The reason for doing this is that functions that satisfy these three 
properties are the building blocks of group theory and Galois theory. An understanding of the 
interaction between these types of functions and the roots of polynomials will eventually lead 
us to the proof of the insolvability of the quintic. We now give a name to those functions, like 
complex conjugation, that satisfy the three conditions in Lemma 5.7. 


Definition 5.8. Let $\sigma : R \to R$ be a function defined on a commutative ring $R$. We say that $\sigma$
is an automorphism of $R$ if


(a) $\sigma$ is a bijection,
(b) $\sigma(x + y) = \sigma(x) + \sigma(y)$, for all $x, y \in R$, and
(c) $\sigma(x \cdot y) = \sigma(x) \cdot \sigma(y)$, for all $x, y \in R$.


Mathematical Induction will allow us to extend properties (b) and (c) from Lemma 5.7 and 
Definition 5.8 into an even more useful form. 


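Lemma 5.9. Suppose $\sigma : R \to R$ is an automorphism of a commutative ring $R$. If
$x_1, x_2, \ldots, x_n \in R$, then

$$\sigma(x_1 + x_2 + \cdots + x_n) = \sigma(x_1) + \sigma(x_2) + \cdots + \sigma(x_n)$$

and

$$\sigma(x_1 \cdot x_2 \cdots x_n) = \sigma(x_1) \cdot \sigma(x_2) \cdots \sigma(x_n).$$

Proof. Let $T$ be those $n \in \mathbb{N}$ such that, for all $x_1, x_2, \ldots, x_n \in R$,

$$\sigma(x_1 + x_2 + \cdots + x_n) = \sigma(x_1) + \sigma(x_2) + \cdots + \sigma(x_n)$$

and

$$\sigma(x_1 \cdot x_2 \cdots x_n) = \sigma(x_1) \cdot \sigma(x_2) \cdots \sigma(x_n).$$

It will suffice to show that $T = \mathbb{N}$, and we will do this by Mathematical Induction.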


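If we examine what it means to say that $1 \in T$, it simply means that $\sigma(x_1) = \sigma(x_1)$, which is
certainly the case. Therefore, we may now assume that $k$ is some natural number that belongs
to $T$, and we need to show that $k + 1 \in T$. To this end, let $x_1, x_2, \ldots, x_k, x_{k+1} \in R$ and consider

$$\sigma(x_1 + x_2 + \cdots + x_k + x_{k+1}) \quad\text{and}\quad \sigma(x_1 \cdot x_2 \cdots x_k \cdot x_{k+1}).$$

To simplify the use of properties (b) and (c) in Definition 5.8, we will let $y = x_1 + x_2 + \cdots + x_k$
and $z = x_1 \cdot x_2 \cdots x_k$. Since $k \in T$, we have

$$\sigma(y) = \sigma(x_1 + x_2 + \cdots + x_k) = \sigma(x_1) + \sigma(x_2) + \cdots + \sigma(x_k)$$

and

$$\sigma(z) = \sigma(x_1 \cdot x_2 \cdots x_k) = \sigma(x_1) \cdot \sigma(x_2) \cdots \sigma(x_k).$$

Therefore, property (b) and the associativity of addition now imply that

$$\sigma(x_1 + x_2 + \cdots + x_k + x_{k+1}) = \sigma((x_1 + x_2 + \cdots + x_k) + x_{k+1}) = \sigma(y + x_{k+1}) = \sigma(y) + \sigma(x_{k+1}) =$$
$$(\sigma(x_1) + \sigma(x_2) + \cdots + \sigma(x_k)) + \sigma(x_{k+1}) = \sigma(x_1) + \sigma(x_2) + \cdots + \sigma(x_k) + \sigma(x_{k+1}).$$

Similarly, property (c) and the associativity of multiplication now imply that

$$\sigma(x_1 \cdot x_2 \cdots x_k \cdot x_{k+1}) = \sigma((x_1 \cdot x_2 \cdots x_k) \cdot x_{k+1}) = \sigma(z \cdot x_{k+1}) = \sigma(z) \cdot \sigma(x_{k+1}) =$$
$$(\sigma(x_1) \cdot \sigma(x_2) \cdots \sigma(x_k)) \cdot \sigma(x_{k+1}) = \sigma(x_1) \cdot \sigma(x_2) \cdots \sigma(x_k) \cdot \sigma(x_{k+1}).$$

Thus, $k + 1 \in T$, thereby concluding the proof.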
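The induction step mirrors how a sum or product of several terms is built up one term at a time. As a small illustration, taking the automorphism to be complex conjugation, the sketch below folds a list of complex numbers with `functools.reduce` and compares the two sides of Lemma 5.9. The sample list `xs` and the helper `conj` are illustrative assumptions.

```python
from functools import reduce
from operator import add, mul


def conj(z: complex) -> complex:
    """Complex conjugation, used here as the automorphism."""
    return complex(z.real, -z.imag)


# Hypothetical sample elements x_1, ..., x_4 with integer parts (arithmetic is exact).
xs = [2 + 3j, 1 - 1j, -4 + 0j, 0 + 5j]

# reduce combines the terms left to right, ((x_1 + x_2) + x_3) + x_4, as in the induction step.
sum_then_conj = conj(reduce(add, xs))
conj_then_sum = reduce(add, map(conj, xs))

prod_then_conj = conj(reduce(mul, xs))
conj_then_prod = reduce(mul, map(conj, xs))

print(sum_then_conj == conj_then_sum, prod_then_conj == conj_then_prod)
```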


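Here are some additional easily checked properties of $*$.
Lemma 5.10. Let $* : \mathbb{C} \to \mathbb{C}$ be complex conjugation. If $\alpha \in \mathbb{C}$, then
(a) $\alpha^* = \alpha$ if and only if $\alpha \in \mathbb{R}$;
(b) $\alpha \cdot \alpha^*$, $\alpha^* \cdot \alpha$, $\alpha + \alpha^*$, $\alpha^* + \alpha \in \mathbb{R}$;
(c) $(\alpha^n)^* = (\alpha^*)^n$, for all $n \in \mathbb{N}$;
(d) $(\alpha^*)^* = \alpha$.
Proof. Let $\alpha = a + bi$; then $\alpha^* = a - bi$. For part (a), if $\alpha \in \mathbb{R}$, then $b = 0$. Hence, $\alpha$ and $\alpha^*$ are
both equal to $a$. On the other hand, if $\alpha^* = \alpha$, then $-bi = bi$, so $b = 0$. Thus, $\alpha \in \mathbb{R}$.
For part (b), we have

$$\alpha \cdot \alpha^* = \alpha^* \cdot \alpha = (a^2 + b^2) + 0i = a^2 + b^2 \in \mathbb{R}$$

and

$$\alpha + \alpha^* = \alpha^* + \alpha = 2a + 0i = 2a \in \mathbb{R}.$$

Thus, $\alpha \cdot \alpha^*$, $\alpha^* \cdot \alpha$, $\alpha + \alpha^*$, $\alpha^* + \alpha \in \mathbb{R}$.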


For part (c), since $*$ is an automorphism, if we let

$$\alpha = x_1 = x_2 = \cdots = x_n$$

and $\sigma = *$ in the last part of Lemma 5.9, we see that $(\alpha^n)^* = (\alpha^*)^n$, for all $n \in \mathbb{N}$.
Finally, for part (d),

$$(\alpha^*)^* = (a - bi)^* = a - (-bi) = \alpha,$$

as desired.


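In the proof of Lemma 5.10(b), we saw that if $\alpha = a + bi$, then $\alpha \cdot \alpha^* = a^2 + b^2$. Since
$a^2 + b^2 \in \mathbb{R}$, if $\alpha \neq 0$, we can consider the complex number

$$\beta = \frac{1}{a^2 + b^2}\,\alpha^* = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\,i.$$

It now follows that

$$\alpha \cdot \beta = \alpha \cdot \left(\frac{1}{a^2 + b^2}\,\alpha^*\right) = \frac{1}{a^2 + b^2}\,(\alpha \cdot \alpha^*) = \frac{1}{a^2 + b^2}\,(a^2 + b^2) = 1.$$

Since multiplication in $\mathbb{C}$ is commutative, $\beta \cdot \alpha = 1$; thus $\beta$ is the multiplicative inverse of $\alpha$,
and we now see where $\beta$ came from in the proof of Theorem 5.5.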
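The recipe "divide the conjugate by $a^2 + b^2$" is easy to carry out exactly with rational arithmetic. The following is a minimal sketch in which a complex number $a + bi$ is stored as the pair `(a, b)`; the function names and the sample value $3 + 4i$ are illustrative assumptions.

```python
from fractions import Fraction


def multiply(x, y):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i, with pairs (real, imaginary)."""
    (a, b), (c, d) = x, y
    return a * c - b * d, a * d + b * c


def inverse(x):
    """Multiplicative inverse of a + bi, computed as (a - bi) / (a^2 + b^2)."""
    a, b = x
    norm = a * a + b * b  # alpha * alpha^*, a real number
    if norm == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return a / norm, -b / norm


alpha = (Fraction(3), Fraction(4))  # hypothetical sample: 3 + 4i
beta = inverse(alpha)

print(beta)                   # (Fraction(3, 25), Fraction(-4, 25))
print(multiply(alpha, beta))  # (Fraction(1, 1), Fraction(0, 1))
```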


If you have dealt with complex numbers in the past, you have probably heard the saying that
complex roots occur in conjugate pairs. For example, if $3 + 2i$ is a root of a polynomial with
real coefficients, then so is $3 - 2i$. More formally, this says that if $\alpha \in \mathbb{C}$ is a root of a
polynomial $p(x) \in \mathbb{R}[x]$, then so is $\alpha^*$. We can now prove this using facts about complex
conjugation.


Proposition 5.11. If $\alpha \in \mathbb{C}$ is a root of a polynomial $p(x)$ with real coefficients, then $\alpha^*$ is
also a root of $p(x)$.
Proof. Since $p(x) \in \mathbb{R}[x]$, we know that

$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$

where each $a_i$ is a real number. Plugging $\alpha$ into $p(x)$, we obtain

$$0 = a_n \alpha^n + a_{n-1} \alpha^{n-1} + \cdots + a_1 \alpha + a_0. \tag{5}$$

The idea behind this proof is to show that applying $*$ to $p(\alpha)$ results in $p(\alpha^*)$. Since
$p(\alpha)^* = 0^* = 0$, this will suffice. Therefore, we begin by applying $*$ to both sides of (5) to
obtain

$$0^* = (a_n \alpha^n + a_{n-1} \alpha^{n-1} + \cdots + a_1 \alpha + a_0)^*.$$

Lemma 5.10(a) tells us that $0^* = 0$. Combining this with the first part of Lemma 5.9, we see
that

$$0 = (a_n \alpha^n)^* + (a_{n-1} \alpha^{n-1})^* + \cdots + (a_1 \alpha)^* + a_0^*.$$

If we next apply property (c) of Lemma 5.7 to each term of the form $(a_i \alpha^i)^*$, we now have

$$0 = a_n^* (\alpha^n)^* + a_{n-1}^* (\alpha^{n-1})^* + \cdots + a_1^* \alpha^* + a_0^*.$$

Since each $a_i \in \mathbb{R}$, Lemma 5.10(a) tells us that $a_i^* = a_i$, and so

$$0 = a_n (\alpha^n)^* + a_{n-1} (\alpha^{n-1})^* + \cdots + a_1 \alpha^* + a_0.$$

Finally, Lemma 5.10(c) tells us that for each $i$, $(\alpha^i)^* = (\alpha^*)^i$. Therefore, we now have

$$0 = a_n (\alpha^*)^n + a_{n-1} (\alpha^*)^{n-1} + \cdots + a_1 \alpha^* + a_0.$$

However, the right-hand side of the previous equation is precisely what we get when we plug
$\alpha^*$ into $p(x)$. Thus, $p(\alpha^*) = 0$, as desired.


Examples


1. Suppose $9 - 2i$ is a root of some polynomial $g(x)$ with real coefficients.
Proposition 5.11 asserts that $g(x)$ must also have $(9 - 2i)^* = 9 + 2i$ as a root.
Therefore, $g(x)$ must have degree at least 2. Observe that

$$(x - (9 - 2i))(x - (9 + 2i)) = x^2 - 18x + 85$$

has real coefficients and has both $9 - 2i$ and $9 + 2i$ as roots. Chapter 12 will show
that $g(x)$ must be a multiple of $x^2 - 18x + 85$.


2. Suppose $4 - 8i$ and $2 + \sqrt{3}\,i$ are roots of a polynomial $p(x)$ with real coefficients.
Proposition 5.11 asserts that

$$(4 - 8i)^* = 4 + 8i \quad\text{and}\quad (2 + \sqrt{3}\,i)^* = 2 - \sqrt{3}\,i$$

are also roots of $p(x)$. The polynomial

$$(x - (4 - 8i))(x - (4 + 8i))(x - (2 + \sqrt{3}\,i))(x - (2 - \sqrt{3}\,i)) = x^4 - 12x^3 + 119x^2 - 376x + 560$$

has real coefficients and has $4 - 8i$, $4 + 8i$, $2 + \sqrt{3}\,i$, and $2 - \sqrt{3}\,i$ as roots. It will
follow from our work in Chapter 12 that $p(x)$ must be a multiple of
$x^4 - 12x^3 + 119x^2 - 376x + 560$.


3. On the other hand, consider the polynomial $f(x) = x^2 - 5ix - 4$. You should check
that $i$ and $4i$ are roots of $f(x)$. However, you should also check that

$$i^* = -i \quad\text{and}\quad (4i)^* = -4i$$

are not roots of $f(x)$. Ask yourself: Does this contradict Proposition 5.11? The
answer is no because $f(x)$ has a coefficient, $-5i$, that does not belong to $\mathbb{R}$, so
Proposition 5.11 does not apply.
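The checks suggested in Examples 1 and 3 can be carried out mechanically. The sketch below evaluates each polynomial with Horner's rule using Python's complex type; the helper names `evaluate` and `conj` are assumptions made for illustration, and the coefficients are listed from the highest-degree term down.

```python
def conj(z: complex) -> complex:
    """Complex conjugation: (a + bi)* = a - bi."""
    return complex(z.real, -z.imag)


def evaluate(coeffs, x):
    """Evaluate a polynomial at x by Horner's rule (highest-degree coefficient first)."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


# Example 1: x^2 - 18x + 85 has real coefficients, so roots come with their conjugates.
g = [1, -18, 85]
root = 9 - 2j
print(evaluate(g, root) == 0, evaluate(g, conj(root)) == 0)

# Example 3: x^2 - 5ix - 4 has a non-real coefficient, so Proposition 5.11 does not apply.
f = [1, -5j, -4]
print(evaluate(f, 1j) == 0, evaluate(f, 4j) == 0)                # i and 4i are roots
print(evaluate(f, conj(1j)) == 0, evaluate(f, conj(4j)) == 0)    # their conjugates are not
```

All values involved have integer real and imaginary parts, so the floating-point comparisons with 0 are exact here.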


When we go back and examine the proof of Proposition 5.11, it appears that the ideas and 
techniques used should apply not only to complex conjugation but also to automorphisms of 
any commutative ring. In an attempt to generalize Proposition 5.11 to other commutative 
rings, we proceed with 


Definition 5.12. If $\sigma : R \to R$ is an automorphism of a commutative ring $R$, define the set $R^\sigma$
as $R^\sigma = \{r \in R \mid \sigma(r) = r\}$.


In algebra or calculus, if we are given a function like $f(x) = x^2 - 6$, we can think of 3 as a fixed
point of $f$, since $f(3) = 3$. In our more abstract setting, $R^\sigma$ consists of those elements of $R$
that are fixed points of $\sigma$. Note that Lemma 5.10(a) tells us that the set of fixed points of $*$ in $\mathbb{C}$
is equal to $\mathbb{R}$. We can now prove our more general version of Proposition 5.11.


Corollary 5.13. Let $\sigma : R \to R$ be an automorphism of a commutative ring $R$. If $\alpha \in R$ is a
root of a polynomial $p(x)$ with coefficients in $R^\sigma$, then $\sigma(\alpha)$ is also a root of $p(x)$.


Intuition. Certainly, Corollary 5.13 is stated in more general terms than Proposition 5.11.
However, does Corollary 5.13 actually provide us with information about roots of polynomials
that we could not obtain from Proposition 5.11? To begin to answer this, let us suppose that
$5 - 3\sqrt{2}$ is the root of some $f(x) \in \mathbb{Q}[x]$. Proposition 5.11 no longer applies but can we say
anything about additional roots of $f(x)$? Similarly, suppose $4 - 2\sqrt{3} + 5\sqrt{7}$ is the root of some
$g(x) \in \mathbb{Q}[x]$. Can we say something about additional roots of $g(x)$?


It will turn out that Corollary 5.13 will soon enable us to assert that $5 + 3\sqrt{2}$ is also a root of
$f(x)$. In addition, it will enable us to conclude that $4 + 2\sqrt{3} + 5\sqrt{7}$, $4 - 2\sqrt{3} - 5\sqrt{7}$, and
$4 + 2\sqrt{3} - 5\sqrt{7}$ are also roots of $g(x)$. Thus, Corollary 5.13 can indeed be applied in many
situations where Proposition 5.11 does not apply.


The proof of Corollary 5.13 will be virtually identical to the proof of Proposition 5.11. Only
one new idea is needed for this proof. Other than that, all we will need to do is to go back
through the proof of Proposition 5.11 and replace $*$ by $\sigma$, $\mathbb{C}$ by $R$, and $\mathbb{R}$ by $R^\sigma$.


Proof. In the proof of Proposition 5.11 we used the fact that $0^* = 0$. In order to modify the
proof of Proposition 5.11 to our more general situation, the only new fact we will need to
prove about commutative rings is that $\sigma(0) = 0$. To show this, we will use the following four
facts that hold in $R$, since $R$ is a commutative ring:


(i) $R$ has an additive identity that is denoted as 0.
(ii) $\sigma(0)$ has an additive inverse in $R$ that is denoted as $-\sigma(0)$.
(iii) $\sigma$ satisfies property (b) of Definition 5.8.
(iv) Addition in $R$ is associative.
Therefore, we now have

$$\sigma(0) = \sigma(0) + 0 = \sigma(0) + (\sigma(0) - \sigma(0)) = (\sigma(0) + \sigma(0)) - \sigma(0) = \sigma(0 + 0) - \sigma(0) = \sigma(0) - \sigma(0) = 0,$$

as desired. You should be careful to check that you understand why each and every equality
holds in the preceding computation. From this point on, the proof of this corollary is simply a
rewriting of the proof of Proposition 5.11 with $\sigma$ replacing $*$, $R$ replacing $\mathbb{C}$, and $R^\sigma$ replacing
$\mathbb{R}$. Therefore, we leave it to you to check all the details.


In order to apply Corollary 5.13 to situations such as where $f(x) \in \mathbb{Q}[x]$ has $5 - 3\sqrt{2}$ as a root,
we need an easy way to construct commutative rings and automorphisms. Proposition 5.15
will do this for us, but first we need


Lemma 5.14. Let $R$ be a commutative ring.

(a) $R$ has only one additive identity element and only one multiplicative identity element.

(b) Each element of $R$ has only one additive inverse.

(c) If $x \in R$, then $x \cdot 0 = 0$ and $0 \cdot x = 0$.

(d) If $x \in R$, then the product $(-1) \cdot x$ is equal to the additive inverse of $x$.

Intuition. Some parts of this lemma, at first glance, might strike you as obvious or pedantic.
In part (a), it may seem completely obvious that there can be only one 0 and only one 1.
Similarly, in part (c), it might seem clear that multiplication by 0 always results in 0. Although
it is not hard to prove these facts, they do require proof and the proofs must exploit the
properties of commutative rings from Definition 5.3.


Remember that in a commutative ring, 0 and 1 are merely symbols used to represent the
additive and multiplicative identities. However, the symbols 0 and 1 are not necessarily the
numbers zero and one that belong to $\mathbb{Z}$, $\mathbb{Q}$, and $\mathbb{R}$. Therefore, it does require proof that a
commutative ring can have only a single additive identity and a single multiplicative identity.
Similarly, it requires proof that an element only has a single additive inverse and that if it has a
multiplicative inverse, then it cannot have more than one multiplicative inverse.


The way that we show that a commutative ring has only one additive identity is that we prove 
that any two elements that satisfy the additive identity property must be equal. Similarly, to 
show that an element x of R has only one additive inverse, we prove that any two additive 
inverses of x must be equal. The proofs of these facts will be rather formal and are not always 
the most exciting aspects of algebra. However, it is necessary to prove these facts. 


Proof. For part (a), suppose that $e$ and $f$ are both additive identities of $R$. We need to show
that $e = f$. Since $e$ and $f$ are both additive identities, we can look at the sum $e + f$ in two
different ways. Since $e$ is an additive identity element, we have $e + f = f$. However, since $f$
is also an additive identity, we also have $e + f = e$. Comparing these two equations,
we immediately see that $e = f$. The proof of the analogous fact for the multiplicative identity
is virtually the same. Observe that if $g, h$ are both multiplicative identities of $R$, then
$g \cdot h = h$ and $g \cdot h = g$. Thus, $g = h$.


For part (b), let $x \in R$ and suppose $y, z$ are both additive inverses of $x$. We need to show that
$y = z$. To do this, we will examine the expression $y + x + z$ two different ways using the
associativity of addition. On the one hand, we have

$$(y + x) + z = 0 + z = z$$

and, on the other hand,

$$y + (x + z) = y + 0 = y.$$

Thus, $y = z$, as desired.


The ideas used to prove part (c) should remind you of the ideas used in the proof of
Corollary 5.13 where we showed that $\sigma(0) = 0$. You should compare the following argument
with the one used in that proof. If $x \in R$, we have

$$x \cdot 0 = x \cdot 0 + 0 = x \cdot 0 + (x \cdot 0 - x \cdot 0) = (x \cdot 0 + x \cdot 0) - x \cdot 0 = x \cdot (0 + 0) - x \cdot 0 = x \cdot 0 - x \cdot 0 = 0.$$

You should make sure that you understand why each and every equality in the preceding
calculation follows from the properties in Definition 5.3. Since multiplication in $R$ is
commutative, it also follows that $0 \cdot x = 0$. It is worth noting that we could also directly prove
$0 \cdot x = 0$ without using the commutativity of multiplication.


For part (d), to show that $(-1) \cdot x$ is the additive inverse of $x$, we will need to show that
$x + (-1) \cdot x = 0$. That will suffice to complete the proof because the commutativity of addition
would then tell us that $(-1) \cdot x + x = 0$. To this end, we have

$$x + (-1) \cdot x = (1) \cdot x + (-1) \cdot x = (1 + (-1)) \cdot x = 0 \cdot x = 0.$$

Remember that you should check that all of the equalities in the preceding calculation do
indeed follow from the earlier parts of this proof and the properties in Definition 5.3.


Technically, proving that a set is a commutative ring requires checking that eight different 
properties all hold. Since one of our goals is to come up with an easy way to produce examples 
of commutative rings, it would be nice to develop a shortcut that, in many circumstances, will 
greatly decrease the number of properties that need to be checked. Our next proposition does 
exactly that. 


Proposition 5.15. Let S be a subset of a commutative ring R with the following properties: 
(a) For all $x, y \in S$, $x + y$ and $x \cdot y$ both belong to $S$.


(b) The additive and multiplicative identities 0 and 1 of R and —1, the additive inverse of 1, 
all belong to S. 


Then S is a commutative ring. 


Proof: Remember that before we can begin checking if a set satisfies the eight properties of 
commutative rings, we first must establish that the set is closed with respect to both addition 
and multiplication. However, in this case, things are already taken care of as (a) states that S is 
closed with respect to both addition and multiplication. 


Note that the associative laws of addition and multiplication, the commutative laws of addition 
and multiplication, and the distributive law hold for all elements of R. Since S is a subset of R, 
these five properties automatically also hold for all elements of S. As a result, we now only 
need to check on three of the eight properties of commutative rings. However, (b) stated that S 
contains both the additive and multiplicative identity elements. Therefore, all that remains is to 
show that the additive inverse of every element of $S$ is also an element of $S$. However, (b) also
states that $-1$ belongs to $S$. Therefore, if $x \in S$, then, by (a), the product $(-1) \cdot x$ also belongs
to $S$. However, Lemma 5.14(d) asserts that $(-1) \cdot x$ is the additive inverse of $x$. Hence, $S$
contains the additive inverse of each of its elements, and $S$ is indeed a commutative ring.


5.4 Automorphisms and Roots of Polynomials 


We can now use Proposition 5.15 to construct examples that illustrate the interplay between 
automorphisms and roots of polynomials. These examples will illustrate that Corollary 5.13 
does have many more applications than Proposition 5.11. 


Example


Let $\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$. Some typical elements of $\mathbb{Q}(\sqrt{2})$ are $3 - 5\sqrt{2}$, $4\sqrt{2}$,
and 50. We would like to see if we can apply Proposition 5.15 to assert that
$\mathbb{Q}(\sqrt{2})$ is a commutative ring. The first observation to make is that since $\mathbb{Q}(\sqrt{2})$ is a
subset of $\mathbb{R}$, we can use the associative, commutative, and distributive laws when adding
and multiplying elements of $\mathbb{Q}(\sqrt{2})$. Therefore, if $a + b\sqrt{2}, c + d\sqrt{2} \in \mathbb{Q}(\sqrt{2})$, they must
add and multiply as follows:

$$(a + b\sqrt{2}) + (c + d\sqrt{2}) = (a + c) + (b + d)\sqrt{2},$$

and

$$(a + b\sqrt{2}) \cdot (c + d\sqrt{2}) = ac + (ad + bc)\sqrt{2} + bd(\sqrt{2})^2 = (ac + 2bd) + (ad + bc)\sqrt{2}.$$

Looking at the preceding equations, we see that since $\mathbb{Q}$ is closed under addition and
multiplication, so is $\mathbb{Q}(\sqrt{2})$. Therefore, part (a) from Proposition 5.15 is satisfied.
Furthermore, since $\mathbb{Q}$ is a subset of $\mathbb{Q}(\sqrt{2})$, we see that $\mathbb{Q}(\sqrt{2})$ also satisfies part (b) from
Proposition 5.15. Thus, $\mathbb{Q}(\sqrt{2})$ is a commutative ring. In fact, it is not difficult to go
one more step and show that $\mathbb{Q}(\sqrt{2})$ is a field, but we will not need that additional
information at this point.


Now suppose that $\sigma$ is an automorphism of $\mathbb{Q}(\sqrt{2})$ that fixes all the elements of $\mathbb{Q}$.
Therefore, properties (b) and (c) of Definition 5.8 imply that

$$\sigma(a + b\sqrt{2}) = \sigma(a) + \sigma(b\sqrt{2}) = \sigma(a) + \sigma(b)\sigma(\sqrt{2}) = a + b\,\sigma(\sqrt{2}),$$

for all $a, b \in \mathbb{Q}$. In light of the preceding equation, if we know what $\sigma(\sqrt{2})$ is equal to,
then we will know the value of $\sigma$ for each element of $\mathbb{Q}(\sqrt{2})$.


In order to compute $\sigma(\sqrt{2})$, we first observe that $\sqrt{2}$ is a root of the polynomial
$p(x) = x^2 - 2$. The coefficients of $p(x)$ belong to $\mathbb{Q}$ and are therefore fixed by $\sigma$. As a
result, we can apply Corollary 5.13 to conclude that $\sigma(\sqrt{2})$ is also a root of $p(x)$. Thus,
either

$$\sigma(\sqrt{2}) = \sqrt{2} \quad\text{or}\quad \sigma(\sqrt{2}) = -\sqrt{2}.$$

Next, consider the functions $\sigma_1, \sigma_2 : \mathbb{Q}(\sqrt{2}) \to \mathbb{Q}(\sqrt{2})$ defined as

$$\sigma_1(a + b\sqrt{2}) = a + b\sqrt{2} \quad\text{and}\quad \sigma_2(a + b\sqrt{2}) = a - b\sqrt{2},$$

for all $a, b \in \mathbb{Q}$. Although we have not yet shown that the functions $\sigma_1, \sigma_2$ are
automorphisms, the preceding argument indicates that no other function could possibly be
an automorphism of $\mathbb{Q}(\sqrt{2})$ that fixes all elements of $\mathbb{Q}$. Thus, there are at most two
automorphisms of $\mathbb{Q}(\sqrt{2})$ that fix all elements of $\mathbb{Q}$, and you should take a moment to
check that $\sigma_1, \sigma_2$ are indeed automorphisms. This means that you need to check that
$\sigma_1, \sigma_2$ satisfy all the conditions in Definition 5.8. Since $\sigma_1$ is the identity map on $\mathbb{Q}(\sqrt{2})$,
there is very little to prove in this case. However, to show that $\sigma_2$ satisfies the conditions
of Definition 5.8, you should first look at the proof of Lemma 5.7 as the ideas and
techniques are quite similar.


Once we know that $\sigma_2$ is an automorphism of $\mathbb{Q}(\sqrt{2})$ that fixes all elements of $\mathbb{Q}$, we can
use Corollary 5.13. For example, suppose $5 - 8\sqrt{2}$ is a root of some $p(x) \in \mathbb{Q}[x]$. Then,
Corollary 5.13 immediately asserts that $\sigma_2(5 - 8\sqrt{2}) = 5 + 8\sqrt{2}$ must also be a root of
$p(x)$. Using the quadratic formula, you can check that $x^2 - 10x - 103$ is an example of a
polynomial that has both $5 - 8\sqrt{2}$ and $5 + 8\sqrt{2}$ as roots. In Chapter 12, we will show
that any $p(x) \in \mathbb{Q}[x]$ that has $5 - 8\sqrt{2}$ as a root must be a multiple of $x^2 - 10x - 103$.


Similarly, if $f(x) \in \mathbb{Q}[x]$ has $-1/2 + 5\sqrt{2}$ as a root, then $f(x)$ must also have
$\sigma_2(-1/2 + 5\sqrt{2}) = -1/2 - 5\sqrt{2}$ as a root. You can check that $4x^2 + 4x - 199$ has both
$-1/2 + 5\sqrt{2}$ and $-1/2 - 5\sqrt{2}$ as roots, and in Chapter 12, you will see that $f(x)$ must be
a multiple of $4x^2 + 4x - 199$.


Before leaving this example, consider the polynomial $g(x) = x^2 + 2\sqrt{2}\,x - 6$. You can
check that $\sqrt{2}$ and $-3\sqrt{2}$ are roots of $g(x)$. However, neither $\sigma_2(\sqrt{2}) = -\sqrt{2}$ nor
$\sigma_2(-3\sqrt{2}) = 3\sqrt{2}$ is a root of $g(x)$. Observe that this does not contradict Corollary 5.13,
as $g(x)$ has a coefficient that is not fixed by $\sigma_2$.
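The arithmetic of $\mathbb{Q}(\sqrt{2})$ and the map $\sigma_2$ can be modeled exactly by storing $a + b\sqrt{2}$ as a pair of rationals. The following is a minimal sketch; the class name `QSqrt2`, the helper names, and the sample grid are assumptions made for illustration, and the sampled checks are a sanity test rather than a proof that $\sigma_2$ is an automorphism.

```python
from fractions import Fraction
from itertools import product


class QSqrt2:
    """An element a + b*sqrt(2) of Q(sqrt(2)), stored exactly as the pair (a, b)."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"


def sigma2(x):
    """The candidate automorphism a + b*sqrt(2) -> a - b*sqrt(2)."""
    return QSqrt2(x.a, -x.b)


def evaluate(coeffs, x):
    """Horner's rule; coefficients (highest degree first) may be rationals or QSqrt2."""
    result = QSqrt2(0)
    for c in coeffs:
        result = result * x + (c if isinstance(c, QSqrt2) else QSqrt2(c))
    return result


ZERO = QSqrt2(0)

# Sampled check of properties (b) and (c) of Definition 5.8 for sigma2.
samples = [QSqrt2(a, b) for a, b in product([-2, 0, 1, Fraction(1, 2)], repeat=2)]
additive = all(sigma2(x + y) == sigma2(x) + sigma2(y) for x in samples for y in samples)
multiplicative = all(sigma2(x * y) == sigma2(x) * sigma2(y) for x in samples for y in samples)

# x^2 - 10x - 103 has rational coefficients: both 5 - 8*sqrt2 and its image are roots.
p = [1, -10, -103]
root = QSqrt2(5, -8)
both_roots = evaluate(p, root) == ZERO and evaluate(p, sigma2(root)) == ZERO

# x^2 + 2*sqrt2*x - 6 has a coefficient moved by sigma2: sqrt2 is a root, -sqrt2 is not.
g = [QSqrt2(1), QSqrt2(0, 2), QSqrt2(-6)]
sqrt2 = QSqrt2(0, 1)
image_fails = evaluate(g, sqrt2) == ZERO and not (evaluate(g, sigma2(sqrt2)) == ZERO)

print(additive, multiplicative, both_roots, image_fails)
```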


We should certainly note that in this example we used Corollary 5.13 in two different ways.
In one direction, we looked at the roots of $x^2 - 2$ and showed that there are only two
automorphisms of $\mathbb{Q}(\sqrt{2})$ that fix every element of $\mathbb{Q}$. In the other direction, we used the
automorphism $\sigma_2$ to find additional roots of some polynomials once we already knew one root.
Thus, we used roots of polynomials to find automorphisms and used automorphisms to find
roots of polynomials. It is this interplay between automorphisms and roots of polynomials that
will be an important and recurring theme throughout this course and much of abstract algebra.


The computations in the next example will be more complicated than those in the previous 
example. However, the ideas and techniques used will be the same. 


Example


Let

$$\mathbb{Q}(i, \sqrt{3}) = \{a + bi + c\sqrt{3} + di\sqrt{3} \mid a, b, c, d \in \mathbb{Q}\}.$$

Some typical elements of $\mathbb{Q}(i, \sqrt{3})$ are $2 - 3i + 4\sqrt{3} - 7i\sqrt{3}$, $3i + 9i\sqrt{3}$, 34, and
$12i - 17\sqrt{3}$. Since $\mathbb{Q}(i, \sqrt{3})$ is a subset of $\mathbb{C}$, the addition and multiplication of elements
in $\mathbb{Q}(i, \sqrt{3})$ must satisfy the associative, commutative, and distributive laws. Therefore,
elements of $\mathbb{Q}(i, \sqrt{3})$ must add and multiply as follows:

$$(a_1 + b_1 i + c_1\sqrt{3} + d_1 i\sqrt{3}) + (a_2 + b_2 i + c_2\sqrt{3} + d_2 i\sqrt{3}) =$$
$$(a_1 + a_2) + (b_1 + b_2)i + (c_1 + c_2)\sqrt{3} + (d_1 + d_2)i\sqrt{3}$$

and

$$(a_1 + b_1 i + c_1\sqrt{3} + d_1 i\sqrt{3}) \cdot (a_2 + b_2 i + c_2\sqrt{3} + d_2 i\sqrt{3}) =$$
$$(a_1 a_2 - b_1 b_2 + 3c_1 c_2 - 3d_1 d_2) + (a_1 b_2 + b_1 a_2 + 3c_1 d_2 + 3d_1 c_2)i +$$
$$(a_1 c_2 - b_1 d_2 + c_1 a_2 - d_1 b_2)\sqrt{3} + (a_1 d_2 + b_1 c_2 + c_1 b_2 + d_1 a_2)i\sqrt{3},$$

for all $a_1, a_2, b_1, b_2, c_1, c_2, d_1, d_2 \in \mathbb{Q}$. It would be good practice for you to use the
associative, commutative, and distributive laws to derive the preceding formulas.


Since $\mathbb{Q}$ is closed under addition and multiplication, the preceding formulas indicate that
$\mathbb{Q}(i, \sqrt{3})$ is also closed under addition and multiplication. Thus, $\mathbb{Q}(i, \sqrt{3})$ satisfies part
(a) from Proposition 5.15. However, since $\mathbb{Q}$ is a subset of $\mathbb{Q}(i, \sqrt{3})$, part (b) from
Proposition 5.15 is also satisfied, and we see that $\mathbb{Q}(i, \sqrt{3})$ is a commutative ring. Now
suppose $\sigma$ is an automorphism of $\mathbb{Q}(i, \sqrt{3})$ that fixes every element of $\mathbb{Q}$. Then
Lemma 5.9 asserts that

$$\sigma(a + bi + c\sqrt{3} + di\sqrt{3}) = \sigma(a) + \sigma(b)\sigma(i) + \sigma(c)\sigma(\sqrt{3}) + \sigma(d)\sigma(i)\sigma(\sqrt{3}) =$$
$$a + b\,\sigma(i) + c\,\sigma(\sqrt{3}) + d\,\sigma(i)\sigma(\sqrt{3}),$$

for all $a, b, c, d \in \mathbb{Q}$. The preceding equation tells us that once we know the values of $\sigma(i)$
and $\sigma(\sqrt{3})$, then we will know the value of $\sigma$ for each element of $\mathbb{Q}(i, \sqrt{3})$.


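Since $i$ is a root of $f(x) = x^2 + 1$ and $\sqrt{3}$ is a root of $g(x) = x^2 - 3$, Corollary 5.13 asserts
that $\sigma(i)$ must be a root of $f(x)$ and $\sigma(\sqrt{3})$ must be a root of $g(x)$. Therefore,

$$\sigma(i) = i \text{ or } -i \quad\text{and}\quad \sigma(\sqrt{3}) = \sqrt{3} \text{ or } -\sqrt{3}.$$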


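In light of this, there are at most four automorphisms of $\mathbb{Q}(i, \sqrt{3})$ that fix the elements of
$\mathbb{Q}$. We will denote these four candidates for being automorphisms as $\sigma_1, \sigma_2, \sigma_3, \sigma_4$, and
they have the properties that

$$\sigma_1(i) = i \text{ and } \sigma_1(\sqrt{3}) = \sqrt{3}; \qquad \sigma_2(i) = -i \text{ and } \sigma_2(\sqrt{3}) = \sqrt{3};$$
$$\sigma_3(i) = i \text{ and } \sigma_3(\sqrt{3}) = -\sqrt{3}; \qquad \sigma_4(i) = -i \text{ and } \sigma_4(\sqrt{3}) = -\sqrt{3}.$$

An expanded way of writing this is

$$\sigma_1(a + bi + c\sqrt{3} + di\sqrt{3}) = a + bi + c\sqrt{3} + di\sqrt{3},$$
$$\sigma_2(a + bi + c\sqrt{3} + di\sqrt{3}) = a - bi + c\sqrt{3} - di\sqrt{3},$$
$$\sigma_3(a + bi + c\sqrt{3} + di\sqrt{3}) = a + bi - c\sqrt{3} - di\sqrt{3},$$
$$\sigma_4(a + bi + c\sqrt{3} + di\sqrt{3}) = a - bi - c\sqrt{3} + di\sqrt{3},$$

for all $a, b, c, d \in \mathbb{Q}$.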


It is clear that $\sigma_1$ is an automorphism of $\mathbb{Q}(i, \sqrt{3})$. However, at this point, in order to
assert that $\sigma_2, \sigma_3, \sigma_4$ are automorphisms of $\mathbb{Q}(i, \sqrt{3})$, we need to check that they satisfy
all three conditions of Definition 5.8. The first two conditions of Definition 5.8 are not
difficult to check. Part (c) of Definition 5.8, however, is straightforward but extremely long
and tedious to verify.


However, good news awaits in Chapter 15. Indeed, Theorem 15.17 often tells us how
many automorphisms a field has. In fact, Theorem 15.17 will tell us that $\mathbb{Q}(i, \sqrt{3})$ has
exactly four automorphisms that fix every element of $\mathbb{Q}$. Similarly, if we apply
Theorem 15.17 to our previous example, it tells us that $\mathbb{Q}(\sqrt{2})$ has exactly two
automorphisms that fix every element of $\mathbb{Q}$.


As we have seen, Corollary 5.13 can supply us with a list of candidates for the
automorphisms of a field. When we combine Corollary 5.13 and Theorem 15.17, we will
often know all the automorphisms of a field without going through a long series of
computations. At this point in this course, we should technically do all the computations
needed to verify that $\sigma_2, \sigma_3, \sigma_4$ are all automorphisms of $\mathbb{Q}(i, \sqrt{3})$. However, when we get
to Chapter 15, Theorem 15.17 will make life much easier for us. Thus, Theorem 15.17 is
an example of a powerful piece of mathematical machinery that will help us avoid pages
of computations.


Once we have reached the point that we believe that $\sigma_1, \sigma_2, \sigma_3, \sigma_4$ are all automorphisms
of $\mathbb{Q}(i, \sqrt{3})$ that fix the elements of $\mathbb{Q}$, we are in a position to apply Corollary 5.13.
Suppose $2i - 6\sqrt{3}$ is a root of some $p(x) \in \mathbb{Q}[x]$; then Corollary 5.13 tells us that

$$\sigma_1(2i - 6\sqrt{3}) = 2i - 6\sqrt{3}, \qquad \sigma_2(2i - 6\sqrt{3}) = -2i - 6\sqrt{3},$$
$$\sigma_3(2i - 6\sqrt{3}) = 2i + 6\sqrt{3}, \qquad \sigma_4(2i - 6\sqrt{3}) = -2i + 6\sqrt{3}$$

are all roots of $p(x)$. Thus, $p(x)$ must have at least four distinct roots. It turns out that
the polynomial $x^4 - 208x^2 + 12{,}544$ is an example of a polynomial in $\mathbb{Q}[x]$ that has these
four roots, and our work in Chapter 12 will imply that $p(x)$ must be a multiple of
$x^4 - 208x^2 + 12{,}544$.


Now suppose $7 + i\sqrt{3}$ is a root of some $g(x) \in \mathbb{Q}[x]$. Then Corollary 5.13 asserts that

$$\sigma_1(7 + i\sqrt{3}) = 7 + i\sqrt{3}, \qquad \sigma_2(7 + i\sqrt{3}) = 7 - i\sqrt{3},$$
$$\sigma_3(7 + i\sqrt{3}) = 7 - i\sqrt{3}, \qquad \sigma_4(7 + i\sqrt{3}) = 7 + i\sqrt{3}$$

are all roots of $g(x)$. Therefore, in this case, Corollary 5.13 only provides us with one
additional root of $g(x)$. You can use the quadratic formula to check that $x^2 - 14x + 52$
has both $7 + i\sqrt{3}$ and $7 - i\sqrt{3}$ as roots. In Chapter 12, we will show that $g(x)$ must be a
multiple of $x^2 - 14x + 52$.
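The long verification of part (c) of Definition 5.8 is the kind of computation a machine handles well. The sketch below stores $a + bi + c\sqrt{3} + di\sqrt{3}$ as the tuple `(a, b, c, d)`, uses the multiplication formula from this example, and spot-checks that each candidate $\sigma_j$ respects sums and products on a sample grid. All function names and the sample grid are assumptions made for illustration; a sampled check is evidence, not a proof.

```python
from fractions import Fraction
from itertools import product


def add(x, y):
    """Componentwise addition of (a, b, c, d) tuples."""
    return tuple(s + t for s, t in zip(x, y))


def mul(x, y):
    """Product in Q(i, sqrt(3)), following the formula given in the example."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1 * a2 - b1 * b2 + 3 * c1 * c2 - 3 * d1 * d2,
        a1 * b2 + b1 * a2 + 3 * c1 * d2 + 3 * d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 - d1 * b2,
        a1 * d2 + b1 * c2 + c1 * b2 + d1 * a2,
    )


def make_sigma(sign_i, sign_r):
    """Candidate map sending i -> sign_i * i and sqrt(3) -> sign_r * sqrt(3)."""
    def sigma(x):
        a, b, c, d = x
        return (a, sign_i * b, sign_r * c, sign_i * sign_r * d)
    return sigma


sigma1, sigma2, sigma3, sigma4 = (
    make_sigma(1, 1), make_sigma(-1, 1), make_sigma(1, -1), make_sigma(-1, -1)
)
sigmas = [sigma1, sigma2, sigma3, sigma4]


def evaluate(coeffs, x):
    """Horner's rule for a polynomial with rational coefficients, highest degree first."""
    result = (0, 0, 0, 0)
    for c in coeffs:
        result = add(mul(result, x), (c, 0, 0, 0))
    return result


# Sampled check of properties (b) and (c) of Definition 5.8.
values = [0, 1, Fraction(-1, 2)]
samples = list(product(values, repeat=4))
respects_sums = all(
    s(add(x, y)) == add(s(x), s(y)) for s in sigmas for x in samples for y in samples
)
respects_products = all(
    s(mul(x, y)) == mul(s(x), s(y)) for s in sigmas for x in samples for y in samples
)

# 2i - 6*sqrt(3): four distinct images, all roots of x^4 - 208x^2 + 12544.
alpha = (0, 2, -6, 0)
images_alpha = {s(alpha) for s in sigmas}
quartic = [1, 0, -208, 0, 12544]

# 7 + i*sqrt(3): only two distinct images, both roots of x^2 - 14x + 52.
gamma = (7, 0, 0, 1)
images_gamma = {s(gamma) for s in sigmas}
quadratic = [1, -14, 52]

print(respects_sums, respects_products, len(images_alpha), len(images_gamma))
```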


In both of our previous examples, we exploited the dual nature of Corollary 5.13. In one 
direction, we used facts about the roots of polynomials to help us find automorphisms. In the 
other direction, we used automorphisms to help us find additional roots of some polynomials. 
Before moving on, we should look at one more example. 


Example


Let $S = \{a + b\,2^{1/3} \mid a, b \in \mathbb{Q}\}$ and $T = \{a + b\,2^{1/3} + c\,2^{2/3} \mid a, b, c \in \mathbb{Q}\}$. In the exercises at the
end of this chapter, you will be asked to show that $2^{2/3} \notin S$. In light of this fact, we see that
$S$ is not closed under multiplication as $2^{1/3} \in S$, yet $2^{1/3} \cdot 2^{1/3} = 2^{2/3} \notin S$. Thus, $S$ is not
a commutative ring.


If $\alpha, \beta$ are any elements of $T$, then they must be of the form

$$\alpha = a + b\,2^{1/3} + c\,2^{2/3}, \qquad \beta = d + e\,2^{1/3} + f\,2^{2/3},$$

where $a, b, c, d, e, f \in \mathbb{Q}$. Since $T$ is a subset of $\mathbb{R}$, when adding or multiplying elements
of $T$, the associative, commutative, and distributive laws hold. Therefore, you should be
able to use these laws to derive formulas for $\alpha + \beta$ and $\alpha \cdot \beta$. These formulas will show that $T$
is closed under both addition and multiplication. By combining this with the fact that $T$
contains $\mathbb{Q}$, we can apply Proposition 5.15 to assert that $T$ is a commutative ring.


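We would now like to find all automorphisms of $T$ that fix every element of $\mathbb{Q}$. If $\sigma$ is such
an automorphism and if $\alpha = a + b\,2^{1/3} + c\,2^{2/3}$, then Lemma 5.9 asserts that

$$\sigma(a + b\,2^{1/3} + c\,2^{2/3}) = \sigma(a + b\,2^{1/3} + c\,2^{1/3}\,2^{1/3}) = \sigma(a) + \sigma(b)\sigma(2^{1/3}) + \sigma(c)\sigma(2^{1/3})\sigma(2^{1/3}) = a + b\,\sigma(2^{1/3}) + c\,\sigma(2^{1/3})^2.$$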


Therefore, if we can find the value of $\sigma(2^{1/3})$, then we will know the value of $\sigma$ for every
element of $T$. Since $2^{1/3}$ is a root of $x^3 - 2$, Corollary 5.13 tells us that $\sigma(2^{1/3})$ must also be
a root of $x^3 - 2$. However, $2^{1/3}$ is the only root of $x^3 - 2$ that is a real number. Since
$T \subseteq \mathbb{R}$, it follows that $2^{1/3}$ is the only root of $x^3 - 2$ that belongs to $T$. However, $\sigma(2^{1/3})$ is a
root of $x^3 - 2$ that belongs to $T$, so $\sigma(2^{1/3}) = 2^{1/3}$. Thus, if $\alpha \in T$, we have $\sigma(\alpha) = \alpha$.
Therefore, $\sigma$ must be the identity map, and it is the only automorphism of $T$ that fixes
every element of $\mathbb{Q}$.


Exercises for Sections 5.3 and 5.4


In exercises 1-10, let $\alpha = 2 - i$, $\beta = 3 - 2i$, and $\gamma = -1 + i$.

1. Compute $\alpha^*$, $\beta^*$, and $\alpha^* \cdot \beta^*$.
2. Compute $\alpha \cdot \beta$ and $(\alpha \cdot \beta)^*$ and compare your answers to those in exercise 1.
3. Compute $\alpha^* + \beta^* + \gamma^*$.
4. Compute $(\alpha + \beta + \gamma)^*$ and compare your answers to those in exercise 3.
5. Compute $\gamma^*$, $(\gamma^*)^2$, and $(\gamma^*)^3$.
6. Compute $\gamma^2$, $\gamma^3$, $(\gamma^2)^*$, $(\gamma^3)^*$ and compare your answers to those in exercise 5.
7. Compute $\alpha^{-1}$ and $(\alpha^{-1})^*$.
8. Compute $\alpha^*$ and $(\alpha^*)^{-1}$ and compare your answers to those in exercise 7.
9. Compute $\alpha^* \cdot \beta^* \cdot \gamma^*$.
10. Compute $(\alpha \cdot \beta \cdot \gamma)^*$ and compare your answer to that in exercise 9.


For exercises 11-47, please read the following:


A polynomial is called monic if the coefficient of the term with the largest exponent is one.
Observe that every polynomial (other than the polynomial that is always zero) can be
multiplied by exactly one nonzero constant to produce a monic polynomial. For example,
$x^2 - 5ix - 4$ is monic, whereas $4x^7 + 4x - 199$ is not monic but can be multiplied by $\frac{1}{4}$ to
produce a monic polynomial.
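The rescaling described above can be sketched in a few lines. This is a minimal illustration, assuming a polynomial is stored as a list of coefficients from the highest power down; the function name `make_monic` and the sample polynomial $2x^2 - 28x + 104$ are chosen here for illustration.

```python
from fractions import Fraction

def make_monic(coeffs):
    """Divide every coefficient by the leading one.

    coeffs lists the coefficients from the highest power down to the
    constant term, and the leading coefficient must be nonzero.
    """
    lead = coeffs[0]
    if lead == 0:
        raise ValueError("leading coefficient must be nonzero")
    scale = Fraction(1) / lead   # the unique constant that works
    return [scale * c for c in coeffs]

# 2x^2 - 28x + 104 is not monic; multiplying by 1/2 gives x^2 - 14x + 52.
sample = [Fraction(2), Fraction(-28), Fraction(104)]
print(make_monic(sample))
```

Exact rational arithmetic is used so that the leading coefficient comes out as exactly one.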


11. Suppose $7 - 2i$ is a root of some $f(x) \in \mathbb{R}[x]$.
(a) Find another root of $f(x)$.
(b) Find a monic polynomial of degree 1 in $\mathbb{C}[x]$ that has $7 - 2i$ as a root.
(c) Find a monic polynomial of degree 2 in $\mathbb{R}[x]$ that has $7 - 2i$ as a root.


12. Suppose $5 + 3i$ is a root of some $g(x) \in \mathbb{R}[x]$.
(a) Find another root of $g(x)$.
(b) Find a monic polynomial of degree 1 in $\mathbb{C}[x]$ that has $5 + 3i$ as a root.
(c) Find a monic polynomial of degree 2 in $\mathbb{R}[x]$ that has $5 + 3i$ as a root.


13. Suppose $\sqrt{3} + 8i$ is a root of some $h(x) \in \mathbb{R}[x]$.
(a) Find another root of $h(x)$.
(b) Find a monic polynomial of degree 1 in $\mathbb{C}[x]$ that has $\sqrt{3} + 8i$ as a root.
(c) Find a monic polynomial of degree 2 in $\mathbb{R}[x]$ that has $\sqrt{3} + 8i$ as a root.


14. Suppose $-4 + \sqrt{5}i$ is a root of some $k(x) \in \mathbb{R}[x]$.
(a) Find another root of $k(x)$.
(b) Find a monic polynomial of degree 1 in $\mathbb{C}[x]$ that has $-4 + \sqrt{5}i$ as a root.
(c) Find a monic polynomial of degree 2 in $\mathbb{R}[x]$ that has $-4 + \sqrt{5}i$ as a root.


15. Suppose $3 - 5i$ and $-7 + 6i$ are roots of some $f(x) \in \mathbb{R}[x]$.
(a) Find two more roots of $f(x)$.
(b) Find a monic polynomial of degree 2 in $\mathbb{C}[x]$ that has $3 - 5i$ and $-7 + 6i$ as roots.
(c) Find a monic polynomial of degree 4 in $\mathbb{R}[x]$ that has $3 - 5i$ and $-7 + 6i$ as roots.


16. Suppose $-\sqrt{2} + i$ and $3 - 4i$ are roots of some $g(x) \in \mathbb{R}[x]$.
(a) Find two more roots of $g(x)$.
(b) Find a monic polynomial of degree 2 in $\mathbb{C}[x]$ that has $-\sqrt{2} + i$ and $3 - 4i$ as roots.
(c) Find a monic polynomial of degree 4 in $\mathbb{R}[x]$ that has $-\sqrt{2} + i$ and $3 - 4i$ as roots.


17. Suppose $7 + 7i$ and $3 + \sqrt{6}i$ are roots of some $h(x) \in \mathbb{R}[x]$.
(a) Find two more roots of $h(x)$.
(b) Find a monic polynomial of degree 2 in $\mathbb{C}[x]$ that has $7 + 7i$ and $3 + \sqrt{6}i$ as roots.
(c) Find a monic polynomial of degree 4 in $\mathbb{R}[x]$ that has $7 + 7i$ and $3 + \sqrt{6}i$ as roots.


18. Suppose $5i$ and $\sqrt{2} - 8i$ are roots of some $k(x) \in \mathbb{R}[x]$.
(a) Find two more roots of $k(x)$.
(b) Find a monic polynomial of degree 2 in $\mathbb{C}[x]$ that has $5i$ and $\sqrt{2} - 8i$ as roots.
(c) Find a monic polynomial of degree 4 in $\mathbb{R}[x]$ which has $5i$ and $\sqrt{2} - 8i$ as roots.


19. Let $\mathbb{Q}(i) = \{a + bi \mid a, b \in \mathbb{Q}\}$.
(a) If $a, b, c, d \in \mathbb{Q}$, find $A, B$ such that
\[ (a + bi) + (c + di) = A + Bi. \]
In your answers, $A$ and $B$ should be expressions in $a, b, c, d$.
(b) If $a, b, c, d \in \mathbb{Q}$, find $C, D$ such that
\[ (a + bi) \cdot (c + di) = C + Di. \]
In your answers, $C$ and $D$ should be expressions in $a, b, c, d$.
(c) Use parts (a) and (b) along with Proposition 5.15 to show that $\mathbb{Q}(i)$ is a
commutative ring.
(d) Show that $\mathbb{Q}(i)$ is a field.


20. Let $\mathbb{Q}(\sqrt{7}) = \{a + b\sqrt{7} \mid a, b \in \mathbb{Q}\}$.
(a) If $a, b, c, d \in \mathbb{Q}$, find $A, B$ such that
\[ (a + b\sqrt{7}) + (c + d\sqrt{7}) = A + B\sqrt{7}. \]
In your answers, $A$ and $B$ should be expressions in $a, b, c, d$.
(b) If $a, b, c, d \in \mathbb{Q}$, find $C, D$ such that
\[ (a + b\sqrt{7}) \cdot (c + d\sqrt{7}) = C + D\sqrt{7}. \]
In your answers, $C$ and $D$ should be expressions in $a, b, c, d$.
(c) Use parts (a) and (b) along with Proposition 5.15 to show that $\mathbb{Q}(\sqrt{7})$ is a
commutative ring.
(d) Show that $\mathbb{Q}(\sqrt{7})$ is a field. (Hint: When you are looking for multiplicative
inverses, think about the product $(a + b\sqrt{7}) \cdot (a - b\sqrt{7})$.)


21. Let $\mathbb{Q}(\sqrt{11}) = \{a + b\sqrt{11} \mid a, b \in \mathbb{Q}\}$.
(a) If $a, b, c, d \in \mathbb{Q}$, find $A, B$ such that
\[ (a + b\sqrt{11}) + (c + d\sqrt{11}) = A + B\sqrt{11}. \]
In your answers, $A$ and $B$ should be expressions in $a, b, c, d$.
(b) If $a, b, c, d \in \mathbb{Q}$, find $C, D$ such that
\[ (a + b\sqrt{11}) \cdot (c + d\sqrt{11}) = C + D\sqrt{11}. \]
In your answers, $C$ and $D$ should be expressions in $a, b, c, d$.
(c) Use parts (a) and (b) along with Proposition 5.15 to show that $\mathbb{Q}(\sqrt{11})$ is a
commutative ring.
(d) Show that $\mathbb{Q}(\sqrt{11})$ is a field. (Hint: When you are looking for multiplicative
inverses, think about the product $(a + b\sqrt{11}) \cdot (a - b\sqrt{11})$.)


22. Let $\mathbb{Q}(i\sqrt{19}) = \{a + bi\sqrt{19} \mid a, b \in \mathbb{Q}\}$.
(a) If $a, b, c, d \in \mathbb{Q}$, find $A, B$ such that
\[ (a + bi\sqrt{19}) + (c + di\sqrt{19}) = A + Bi\sqrt{19}. \]
In your answers, $A$ and $B$ should be expressions in $a, b, c, d$.
(b) If $a, b, c, d \in \mathbb{Q}$, find $C, D$ such that
\[ (a + bi\sqrt{19}) \cdot (c + di\sqrt{19}) = C + Di\sqrt{19}. \]
In your answers, $C$ and $D$ should be expressions in $a, b, c, d$.
(c) Use parts (a) and (b) along with Proposition 5.15 to show that $\mathbb{Q}(i\sqrt{19})$ is a
commutative ring.
(d) Show that $\mathbb{Q}(i\sqrt{19})$ is a field. (Hint: When you are looking for multiplicative
inverses, think about the product $(a + bi\sqrt{19}) \cdot (a - bi\sqrt{19})$.)


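23. Let $\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \{a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \mid a, b, c, d \in \mathbb{Q}\}$.
(a) If $a, b, c, d, e, f, g, h \in \mathbb{Q}$, find $A, B, C, D$ such that
\[ (a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}) + (e + f\sqrt{2} + g\sqrt{3} + h\sqrt{6}) = A + B\sqrt{2} + C\sqrt{3} + D\sqrt{6}. \]
In your answers, $A, B, C, D$ should be expressions in $a, b, c, d, e, f, g, h$.
(b) If $a, b, c, d, e, f, g, h \in \mathbb{Q}$, find $E, F, G, H$ such that
\[ (a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}) \cdot (e + f\sqrt{2} + g\sqrt{3} + h\sqrt{6}) = E + F\sqrt{2} + G\sqrt{3} + H\sqrt{6}. \]
In your answers, $E, F, G, H$ should be expressions in $a, b, c, d, e, f, g, h$.
(c) Use parts (a) and (b) along with Proposition 5.15 to show that $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is a
commutative ring.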


24. Let $\mathbb{Q}(\sqrt{7}, i) = \{a + b\sqrt{7} + ci + di\sqrt{7} \mid a, b, c, d \in \mathbb{Q}\}$.
(a) If $a, b, c, d, e, f, g, h \in \mathbb{Q}$, find $A, B, C, D$ such that
\[ (a + b\sqrt{7} + ci + di\sqrt{7}) + (e + f\sqrt{7} + gi + hi\sqrt{7}) = A + B\sqrt{7} + Ci + Di\sqrt{7}. \]
In your answers, $A, B, C, D$ should be expressions in $a, b, c, d, e, f, g, h$.
(b) If $a, b, c, d, e, f, g, h \in \mathbb{Q}$, find $E, F, G, H$ such that
\[ (a + b\sqrt{7} + ci + di\sqrt{7}) \cdot (e + f\sqrt{7} + gi + hi\sqrt{7}) = E + F\sqrt{7} + Gi + Hi\sqrt{7}. \]
In your answers, $E, F, G, H$ should be expressions in $a, b, c, d, e, f, g, h$.
(c) Use parts (a) and (b) along with Proposition 5.15 to show that $\mathbb{Q}(\sqrt{7}, i)$ is a
commutative ring.


25. Let $\sigma$ be an automorphism of the field $\mathbb{Q}(i)$ that fixes every element of $\mathbb{Q}$. Show that
either $\sigma(i) = i$ or $\sigma(i) = -i$.

26. Use your work from exercise 19 to show that both possibilities for $\sigma$ in exercise 25 result
in automorphisms of $\mathbb{Q}(i)$.

27. Let $\tau$ be an automorphism of the field $\mathbb{Q}(\sqrt{7})$ that fixes every element of $\mathbb{Q}$. Show that
either $\tau(\sqrt{7}) = \sqrt{7}$ or $\tau(\sqrt{7}) = -\sqrt{7}$.

28. Use your work from exercise 20 to show that both possibilities for $\tau$ in exercise 27 result
in automorphisms of $\mathbb{Q}(\sqrt{7})$.

29. Let $\nu$ be an automorphism of the field $\mathbb{Q}(\sqrt{11})$ that fixes every element of $\mathbb{Q}$. Show that
either $\nu(\sqrt{11}) = \sqrt{11}$ or $\nu(\sqrt{11}) = -\sqrt{11}$.

30. Use your work from exercise 21 to show that both possibilities for $\nu$ in exercise 29 result
in automorphisms of $\mathbb{Q}(\sqrt{11})$.

31. Let $\sigma$ be an automorphism of the field $\mathbb{Q}(i\sqrt{19})$ that fixes every element of $\mathbb{Q}$. Show that
either $\sigma(i\sqrt{19}) = i\sqrt{19}$ or $\sigma(i\sqrt{19}) = -i\sqrt{19}$.

32. Use your work from exercise 22 to show that both possibilities for $\sigma$ in exercise 31 result
in automorphisms of $\mathbb{Q}(i\sqrt{19})$.


33. Show that every automorphism of the field $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ from exercise 23 that fixes
every element of $\mathbb{Q}$ is completely determined by where it sends $\sqrt{2}$ and $\sqrt{3}$. Use this
fact to show that $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ has at most four automorphisms that fix every element
of $\mathbb{Q}$.

34. Let $\sigma$ be the automorphism of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ such that $\sigma(a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}) =
a - b\sqrt{2} + c\sqrt{3} - d\sqrt{6}$, where $a, b, c, d \in \mathbb{Q}$. Describe, in terms of $a, b, c, d$, all elements
of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ that are fixed by $\sigma$.

35. Let $\tau$ be the automorphism of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ such that $\tau(a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}) =
a + b\sqrt{2} - c\sqrt{3} - d\sqrt{6}$, where $a, b, c, d \in \mathbb{Q}$. Describe, in terms of $a, b, c, d$, all elements
of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ that are fixed by $\tau$.

36. Let $\nu$ be the automorphism of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ such that $\nu(a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}) =
a - b\sqrt{2} - c\sqrt{3} + d\sqrt{6}$, where $a, b, c, d \in \mathbb{Q}$. Describe, in terms of $a, b, c, d$, all elements
of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ that are fixed by $\nu$.

37. Show that every automorphism of the field $\mathbb{Q}(\sqrt{7}, i)$ from exercise 24 that fixes every
element of $\mathbb{Q}$ is completely determined by where it sends $\sqrt{7}$ and $i$. Use this fact to show
that $\mathbb{Q}(\sqrt{7}, i)$ has at most four automorphisms that fix every element of $\mathbb{Q}$.

38. Let $\sigma$ be the automorphism of $\mathbb{Q}(\sqrt{7}, i)$ such that $\sigma(a + b\sqrt{7} + ci + di\sqrt{7}) = a - b\sqrt{7} +
ci - di\sqrt{7}$, where $a, b, c, d \in \mathbb{Q}$. Describe, in terms of $a, b, c, d$, all elements of
$\mathbb{Q}(\sqrt{7}, i)$ that are fixed by $\sigma$.

39. Let $\tau$ be the automorphism of $\mathbb{Q}(\sqrt{7}, i)$ such that $\tau(a + b\sqrt{7} + ci + di\sqrt{7}) = a + b\sqrt{7}
- ci - di\sqrt{7}$, where $a, b, c, d \in \mathbb{Q}$. Describe, in terms of $a, b, c, d$, all elements of
$\mathbb{Q}(\sqrt{7}, i)$ that are fixed by $\tau$.

40. Let $\nu$ be the automorphism of $\mathbb{Q}(\sqrt{7}, i)$ such that $\nu(a + b\sqrt{7} + ci + di\sqrt{7}) = a - b\sqrt{7} -
ci + di\sqrt{7}$, where $a, b, c, d \in \mathbb{Q}$. Describe, in terms of $a, b, c, d$, all elements of
$\mathbb{Q}(\sqrt{7}, i)$ that are fixed by $\nu$.


41. Show that $2^{2/3}$ cannot be written in the form $a + b2^{1/3}$, where $a, b \in \mathbb{Q}$. Observe that this
fact proves that the set $S$ in the last example in Section 5.4 is not a commutative ring.

42. Let $\mathbb{Q}(2^{1/3}) = \{a + b2^{1/3} + c2^{2/3} \mid a, b, c \in \mathbb{Q}\}$.
(a) If $a, b, c, d, e, f \in \mathbb{Q}$, find $A, B, C$ such that
\[ (a + b2^{1/3} + c2^{2/3}) + (d + e2^{1/3} + f2^{2/3}) = A + B2^{1/3} + C2^{2/3}. \]
In your final answer, $A, B, C$ should be expressions in $a, b, c, d, e, f$.
(b) If $a, b, c, d, e, f \in \mathbb{Q}$, find $D, E, F$ such that
\[ (a + b2^{1/3} + c2^{2/3}) \cdot (d + e2^{1/3} + f2^{2/3}) = D + E2^{1/3} + F2^{2/3}. \]
In your final answer, $D, E, F$ should be expressions in $a, b, c, d, e, f$.
(c) Use parts (a) and (b) along with Proposition 5.15 to show that $\mathbb{Q}(2^{1/3})$, which is the
same as the set $T$ in the last example in Section 5.4, is a commutative ring.

43. Let $\mathbb{Q}(7^{1/3}) = \{a + b7^{1/3} + c7^{2/3} \mid a, b, c \in \mathbb{Q}\}$.
(a) If $a, b, c, d, e, f \in \mathbb{Q}$, find $A, B, C$ such that
\[ (a + b7^{1/3} + c7^{2/3}) + (d + e7^{1/3} + f7^{2/3}) = A + B7^{1/3} + C7^{2/3}. \]
In your final answer, $A, B, C$ should be expressions in $a, b, c, d, e, f$.
(b) If $a, b, c, d, e, f \in \mathbb{Q}$, find $D, E, F$ such that
\[ (a + b7^{1/3} + c7^{2/3}) \cdot (d + e7^{1/3} + f7^{2/3}) = D + E7^{1/3} + F7^{2/3}. \]
In your final answer, $D, E, F$ should be expressions in $a, b, c, d, e, f$.
(c) Use parts (a) and (b) along with Proposition 5.15 to show that $\mathbb{Q}(7^{1/3})$ is a
commutative ring.

44. Let $\mathbb{Q}(7^{1/3})$ be the commutative ring from exercise 43. Show that the identity map is the
only automorphism of $\mathbb{Q}(7^{1/3})$ which fixes every element of $\mathbb{Q}$.


45. Let $\alpha = 3 - 2\sqrt{7} \in \mathbb{Q}(\sqrt{7})$. You might want to refer to exercises 27 and 28 while doing
this exercise.
(a) If $f(x) \in \mathbb{Q}[x]$ has $\alpha$ as a root, what else can you say about the roots of $f(x)$?
(b) Find a monic polynomial of degree 2 in $\mathbb{Q}[x]$ that has $\alpha$ as a root.

46. Let $\beta = 3 + 4\sqrt{11} \in \mathbb{Q}(\sqrt{11})$. You might want to refer to exercises 29 and 30 while
doing this exercise.
(a) If $g(x) \in \mathbb{Q}[x]$ has $\beta$ as a root, what else can you say about the roots of $g(x)$?
(b) Find a monic polynomial of degree 2 in $\mathbb{Q}[x]$ that has $\beta$ as a root.

47. Let $\gamma = 4 + 5i\sqrt{19} \in \mathbb{Q}(i\sqrt{19})$. You might want to refer to exercises 31 and 32 while
doing this exercise.
(a) If $h(x) \in \mathbb{Q}[x]$ has $\gamma$ as a root, what else can you say about the roots of $h(x)$?
(b) Find a monic polynomial of degree 2 in $\mathbb{Q}[x]$ that has $\gamma$ as a root.


In exercises 48-51, you will come across the sets $\mathbb{Q}(\sqrt{2})[x]$, $\mathbb{Q}(\sqrt{3})[x]$, $\mathbb{Q}(\sqrt{6})[x]$, $\mathbb{Q}(\sqrt{7})[x]$,
$\mathbb{Q}(i)[x]$, $\mathbb{Q}(i\sqrt{7})[x]$. In each case, it represents the polynomials over a field $F$, where $F$
consists of all terms of the form $\{a + bC \mid a, b \in \mathbb{Q}\}$, where $C$ is either $\sqrt{2}$, $\sqrt{3}$, $\sqrt{6}$, $\sqrt{7}$, $i$,
or $i\sqrt{7}$.


48. Let $\alpha = 5\sqrt{2} - 7\sqrt{3} \in \mathbb{Q}(\sqrt{2}, \sqrt{3})$. You might want to refer to exercise 33 while doing
this problem.
(a) If $f(x) \in \mathbb{Q}[x]$ has $\alpha$ as a root, what else can you say about the roots of $f(x)$?
(b) Find a monic polynomial of degree 4 in $\mathbb{Q}[x]$ that has $\alpha$ as a root.
(c) If $g(x) \in \mathbb{Q}(\sqrt{2})[x]$ has $\alpha$ as a root, what else can you say about the roots of $g(x)$?
(d) Find a monic polynomial of degree 2 in $\mathbb{Q}(\sqrt{2})[x]$ that has $\alpha$ as a root.
(e) If $h(x) \in \mathbb{Q}(\sqrt{3})[x]$ has $\alpha$ as a root, what else can you say about the roots of $h(x)$?
(f) Find a monic polynomial of degree 2 in $\mathbb{Q}(\sqrt{3})[x]$ that has $\alpha$ as a root.
(g) If $k(x) \in \mathbb{Q}(\sqrt{6})[x]$ has $\alpha$ as a root, what else can you say about the roots of $k(x)$?
(h) Find a monic polynomial of degree 2 in $\mathbb{Q}(\sqrt{6})[x]$ that has $\alpha$ as a root.


49. Let $\beta = 3 - 4\sqrt{2} + 2\sqrt{6} \in \mathbb{Q}(\sqrt{2}, \sqrt{3})$. You might want to refer to exercise 33 while
doing this problem.
(a) If $f(x) \in \mathbb{Q}[x]$ has $\beta$ as a root, what else can you say about the roots of $f(x)$?
(b) Find a monic polynomial of degree 4 in $\mathbb{Q}[x]$ that has $\beta$ as a root.
(c) If $g(x) \in \mathbb{Q}(\sqrt{2})[x]$ has $\beta$ as a root, what else can you say about the roots of $g(x)$?
(d) Find a monic polynomial of degree 2 in $\mathbb{Q}(\sqrt{2})[x]$ that has $\beta$ as a root.
(e) If $h(x) \in \mathbb{Q}(\sqrt{3})[x]$ has $\beta$ as a root, what else can you say about the roots of $h(x)$?
(f) Find a monic polynomial of degree 2 in $\mathbb{Q}(\sqrt{3})[x]$ that has $\beta$ as a root.
(g) If $k(x) \in \mathbb{Q}(\sqrt{6})[x]$ has $\beta$ as a root, what else can you say about the roots of $k(x)$?
(h) Find a monic polynomial of degree 2 in $\mathbb{Q}(\sqrt{6})[x]$ that has $\beta$ as a root.


50. Let $\gamma = 2\sqrt{7} - 5i \in \mathbb{Q}(\sqrt{7}, i)$. You might want to refer to exercise 37 while doing this
problem.
(a) If $f(x) \in \mathbb{Q}[x]$ has $\gamma$ as a root, what else can you say about the roots of $f(x)$?
(b) Find a monic polynomial of degree 4 in $\mathbb{Q}[x]$ that has $\gamma$ as a root.
(c) If $g(x) \in \mathbb{Q}(\sqrt{7})[x]$ has $\gamma$ as a root, what else can you say about the roots of $g(x)$?
(d) Find a monic polynomial of degree 2 in $\mathbb{Q}(\sqrt{7})[x]$ that has $\gamma$ as a root.
(e) If $h(x) \in \mathbb{Q}(i)[x]$ has $\gamma$ as a root, what else can you say about the roots of $h(x)$?
(f) Find a monic polynomial of degree 2 in $\mathbb{Q}(i)[x]$ that has $\gamma$ as a root.
(g) If $k(x) \in \mathbb{Q}(i\sqrt{7})[x]$ has $\gamma$ as a root, what else can you say about the roots of $k(x)$?
(h) Find a monic polynomial of degree 2 in $\mathbb{Q}(i\sqrt{7})[x]$ that has $\gamma$ as a root.


51. Let $\delta = 1 + 3\sqrt{7} - 5i\sqrt{7} \in \mathbb{Q}(\sqrt{7}, i)$. You might want to refer to exercise 37 while doing
this problem.
(a) If $f(x) \in \mathbb{Q}[x]$ has $\delta$ as a root, what else can you say about the roots of $f(x)$?
(b) Find a monic polynomial of degree 4 in $\mathbb{Q}[x]$ that has $\delta$ as a root.
(c) If $g(x) \in \mathbb{Q}(\sqrt{7})[x]$ has $\delta$ as a root, what else can you say about the roots of $g(x)$?
(d) Find a monic polynomial of degree 2 in $\mathbb{Q}(\sqrt{7})[x]$ that has $\delta$ as a root.
(e) If $h(x) \in \mathbb{Q}(i)[x]$ has $\delta$ as a root, what else can you say about the roots of $h(x)$?
(f) Find a monic polynomial of degree 2 in $\mathbb{Q}(i)[x]$ that has $\delta$ as a root.
(g) If $k(x) \in \mathbb{Q}(i\sqrt{7})[x]$ has $\delta$ as a root, what else can you say about the roots of $k(x)$?
(h) Find a monic polynomial of degree 2 in $\mathbb{Q}(i\sqrt{7})[x]$ that has $\delta$ as a root.

52. Show that if $R$ is a noncommutative ring, then $0 \cdot x = 0$, for all $x \in R$.


53. Let $x, y, z$ be elements of a commutative ring such that both $y$ and $z$ are multiplicative
inverses of $x$. Prove that $y = z$.


54. Let $x, y$ be elements of a commutative ring.
(a) Show that $x \cdot y$ and $(-x) \cdot (-y)$ are both additive inverses of $(-x) \cdot y$.
(b) Use (a) to prove that $x \cdot y = (-x) \cdot (-y)$.
(c) Does (b) help you understand why the product of two negative numbers is positive?


5.5 Groups of Automorphisms of Commutative Rings 


In the examples at the end of the previous section, we examined the automorphisms of various 
commutative rings. In these examples, we used roots of polynomials to determine 
automorphisms and used automorphisms to find roots of polynomials. We will now begin to 
look at these automorphisms from a slightly different perspective. 


The automorphisms $\sigma_1, \sigma_2, \sigma_3, \sigma_4$ of $\mathbb{Q}(i, \sqrt{3})$ all belong to $\mathrm{Bij}(\mathbb{Q}(i, \sqrt{3}))$, which is the set of
bijections from $\mathbb{Q}(i, \sqrt{3})$ to $\mathbb{Q}(i, \sqrt{3})$. Recall that Theorem 2.16 told us that if $S$ is any set and
if $\circ$ represents the composition of functions, then $\circ$ is a binary operation on $\mathrm{Bij}(S)$ that is
associative, has an identity element, and each element has an inverse in $\mathrm{Bij}(S)$.


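If we let $G = \{\sigma_1, \sigma_2, \sigma_3, \sigma_4\}$ then, when we compose the functions in $G$, we obtain the
following table:

\[
\begin{array}{c|cccc}
\circ & \sigma_1 & \sigma_2 & \sigma_3 & \sigma_4 \\ \hline
\sigma_1 & \sigma_1 & \sigma_2 & \sigma_3 & \sigma_4 \\
\sigma_2 & \sigma_2 & \sigma_1 & \sigma_4 & \sigma_3 \\
\sigma_3 & \sigma_3 & \sigma_4 & \sigma_1 & \sigma_2 \\
\sigma_4 & \sigma_4 & \sigma_3 & \sigma_2 & \sigma_1
\end{array}
\]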

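For example, the table tells us that

\[ \sigma_2 \circ \sigma_3 = \sigma_4, \quad \sigma_3 \circ \sigma_4 = \sigma_2, \quad \sigma_1 \circ \sigma_2 = \sigma_2, \quad \sigma_3 \circ \sigma_3 = \sigma_1. \]

We will not verify all 16 entries on the table, but we will check that $\sigma_2 \circ \sigma_3 = \sigma_4$. If
$a, b, c, d \in \mathbb{Q}$, we have

\[ (\sigma_2 \circ \sigma_3)(a + bi + c\sqrt{3} + di\sqrt{3}) = \sigma_2(\sigma_3(a + bi + c\sqrt{3} + di\sqrt{3})) = \]
\[ \sigma_2(a + bi - c\sqrt{3} - di\sqrt{3}) = a - bi - c\sqrt{3} + di\sqrt{3} = \]
\[ \sigma_4(a + bi + c\sqrt{3} + di\sqrt{3}). \]

The functions $\sigma_2 \circ \sigma_3$ and $\sigma_4$ agree on all elements of $\mathbb{Q}(i, \sqrt{3})$, so they are indeed the same
function.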
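The remaining entries of the table can be checked mechanically. The sketch below is a hypothetical illustration: it stores $a + bi + c\sqrt{3} + di\sqrt{3}$ as the tuple $(a, b, c, d)$ and assumes, consistent with the computation above, that $\sigma_1$ is the identity, $\sigma_2$ sends $i$ to $-i$, $\sigma_3$ sends $\sqrt{3}$ to $-\sqrt{3}$, and $\sigma_4$ does both. The names `SIGNS`, `apply`, and `compose` are chosen here.

```python
from itertools import product

# SIGNS[k] = (s_i, s_3): sigma_k sends i to s_i * i and sqrt(3) to s_3 * sqrt(3).
SIGNS = {1: (1, 1), 2: (-1, 1), 3: (1, -1), 4: (-1, -1)}

def apply(k, elem):
    # elem = (a, b, c, d) stands for a + b*i + c*sqrt(3) + d*i*sqrt(3).
    a, b, c, d = elem
    s_i, s_3 = SIGNS[k]
    return (a, s_i * b, s_3 * c, s_i * s_3 * d)

def compose(j, k):
    """Return m such that sigma_j after sigma_k equals sigma_m."""
    probe = (1, 1, 1, 1)   # the four maps send this element to four different places
    image = apply(j, apply(k, probe))
    for m in SIGNS:
        if apply(m, probe) == image:
            return m
    raise ValueError("composition is not one of the four maps")

table = {(j, k): compose(j, k) for j, k in product(SIGNS, repeat=2)}
for j in SIGNS:
    print([table[(j, k)] for k in SIGNS])
```

Each printed row lists the subscripts of one row of the composition table.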


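At this point, it is beginning to look like $G = \{\sigma_1, \sigma_2, \sigma_3, \sigma_4\}$ has an algebraic structure that
satisfies the same properties as $\mathrm{Bij}(S)$. Our table indicates that when $g, h \in G$, we have
$g \circ h \in G$. Next, since $G$ is a subset of $\mathrm{Bij}(\mathbb{Q}(i, \sqrt{3}))$, $\circ$ is certainly associative. It is easy to see
that $\sigma_1$ is the identity element of $G$ under $\circ$. Finally, since

\[ \sigma_1 \circ \sigma_1 = \sigma_1, \quad \sigma_2 \circ \sigma_2 = \sigma_1, \quad \sigma_3 \circ \sigma_3 = \sigma_1, \quad \sigma_4 \circ \sigma_4 = \sigma_1, \]

we see that every element of $G$ has an inverse in $G$ under $\circ$.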


As we remarked toward the beginning of Section 5.2, when several mathematical objects have
a collection of properties in common, we often give the objects with these properties a special
name. This leads us to


Definition 5.16. A set $G$ with a binary operation $\circ$ is called a group if $\circ$ satisfies the
following:


1. Associative Law: For every $x, y, z \in G$, $(x \circ y) \circ z = x \circ (y \circ z)$.


2. Identity: There is an element in $G$, usually denoted as $e$, such that $x \circ e = x = e \circ x$, for
every $x \in G$.


3. Inverses: For every $x \in G$ there is an element in $G$, usually denoted as $x^{-1}$, such that
$x \circ x^{-1} = e = x^{-1} \circ x$.
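For a finite set, the three conditions of Definition 5.16 can be tested by brute force. The following is a minimal sketch; the function name `is_group` and the three sample operations on integers are chosen here for illustration and are not taken from the text. Closure is checked first, since the definition requires $\circ$ to be a binary operation on $G$.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force test of the group conditions on a finite set."""
    elements = list(elements)
    # op must be a binary operation on the set (closure).
    if any(op(x, y) not in elements for x, y in product(elements, repeat=2)):
        return False
    # 1. Associative law.
    if any(op(op(x, y), z) != op(x, op(y, z))
           for x, y, z in product(elements, repeat=3)):
        return False
    # 2. Identity element.
    identities = [e for e in elements
                  if all(op(x, e) == x == op(e, x) for x in elements)]
    if not identities:
        return False
    e = identities[0]
    # 3. Every element has an inverse.
    return all(any(op(x, y) == e == op(y, x) for y in elements)
               for x in elements)

print(is_group(range(4), lambda x, y: (x + y) % 4))      # addition mod 4
print(is_group(range(4), lambda x, y: (x * y) % 4))      # multiplication mod 4
print(is_group(range(1, 5), lambda x, y: (x * y) % 5))   # nonzero classes mod 5
```

The second call fails at the last step, because 0 has no multiplicative inverse modulo 4.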


Although we did not use the term group at the time, we have already been introduced to three
large classes of groups in Sections 2.4 and 5.2. In Section 2.4 we introduced $\mathrm{Bij}(S)$ and, with
composition of functions as the binary operation, it represents our first large class of groups.
Next, in Section 5.2, we were introduced to commutative rings. When you look back at the
properties satisfied by commutative rings, we obtain our second large class of groups, as you
can see that every commutative ring is a group under addition. In Section 5.2, we were also
introduced to fields. One of the exercises after Section 5.2 has you show that the product of
any two nonzero elements of a field is also nonzero. From that fact, we obtain our third large
class of groups, as it is now easy to see that the nonzero elements of a field form a group under
multiplication.


In light of our work with the automorphisms of $\mathbb{Q}(i, \sqrt{3})$ that fix the elements of $\mathbb{Q}$, it appears
that automorphisms of commutative rings will give us another large class of groups. We record
this as


Definition 5.17. If $L$ and $K$ are commutative rings with $K \subseteq L$, let $\mathrm{Gal}(L/K)$ denote the set
of all automorphisms of $L$ that fix every element of $K$. We call $\mathrm{Gal}(L/K)$ the Galois group of
$L$ over $K$.


Although we refer to $\mathrm{Gal}(L/K)$ as the Galois group, we haven't yet proved that the set
$\mathrm{Gal}(L/K)$ is actually a group under composition of functions. However, we have already
proven some related facts. For example, we saw in Theorem 2.16 that $\mathrm{Bij}(S)$ is always a
group. In addition, our table from earlier in this section shows that $\mathrm{Gal}(\mathbb{Q}(i, \sqrt{3})/\mathbb{Q})$ is a
group. We will now show that, for any commutative rings $K \subseteq L$, the set $\mathrm{Gal}(L/K)$ is always
a group under composition of functions.


Theorem 5.18. If $L$ and $K$ are commutative rings with $K \subseteq L$, then $\mathrm{Gal}(L/K)$ satisfies the
following properties:


(a) if $f, g \in \mathrm{Gal}(L/K)$, then $f \circ g \in \mathrm{Gal}(L/K)$;
(b) if $f, g, h \in \mathrm{Gal}(L/K)$, then $(f \circ g) \circ h = f \circ (g \circ h)$;
(c) there is an element $e \in \mathrm{Gal}(L/K)$ such that $f \circ e = f = e \circ f$, for all $f \in \mathrm{Gal}(L/K)$;
(d) for every $f \in \mathrm{Gal}(L/K)$, there is an element in $\mathrm{Gal}(L/K)$, denoted as $f^{-1}$, such that
$f \circ f^{-1} = e = f^{-1} \circ f$.


Thus, $\mathrm{Gal}(L/K)$ is a group under the composition of functions.


Proof. Suppose $f, g \in \mathrm{Gal}(L/K)$; since both $f$ and $g$ are bijections, so is the composition
$f \circ g$. Therefore, to show that $f \circ g \in \mathrm{Gal}(L/K)$, we first need to show that $f \circ g$ satisfies
properties (b) and (c) of Definition 5.8. This will tell us that $f \circ g$ is an automorphism. Then
we will need to show that $f \circ g$ fixes every element of $K$. If $x, y \in L$, using the fact that both
$f$ and $g$ satisfy property (b) of Definition 5.8, we have

\[ (f \circ g)(x + y) = f(g(x + y)) = f(g(x) + g(y)) = f(g(x)) + f(g(y)) = (f \circ g)(x) + (f \circ g)(y). \]

Thus, $f \circ g$ satisfies property (b) of Definition 5.8. Similarly, since $f$ and $g$ both satisfy
property (c) of Definition 5.8, we also have

\[ (f \circ g)(x \cdot y) = f(g(x \cdot y)) = f(g(x) \cdot g(y)) = f(g(x)) \cdot f(g(y)) = (f \circ g)(x) \cdot (f \circ g)(y). \]

As a result, $f \circ g$ satisfies property (c) of Definition 5.8 and is therefore an automorphism of $L$.
Next, if $x \in K$, since both $f$ and $g$ fix $x$, we have

\[ (f \circ g)(x) = f(g(x)) = f(x) = x. \]

Hence, $f \circ g$ also fixes every $x \in K$, so $f \circ g \in \mathrm{Gal}(L/K)$, thereby concluding the proof of
part (a).


Since the composition of functions is associative, we know that part (b) holds. For part (c),
suppose we let $e : L \to L$ be the function that fixes every element of $L$. Then $e$ is certainly a
bijection that also fixes every element of $K$. If $x, y \in L$, then

\[ e(x + y) = x + y = e(x) + e(y) \]

and

\[ e(x \cdot y) = x \cdot y = e(x) \cdot e(y). \]

Therefore, $e$ also satisfies properties (b) and (c) of Definition 5.8, and we now know that
$e \in \mathrm{Gal}(L/K)$.


Observe that if $f \in \mathrm{Gal}(L/K)$ and if $x \in L$, then

\[ (f \circ e)(x) = f(e(x)) = f(x) \quad \text{and} \quad (e \circ f)(x) = e(f(x)) = f(x). \]

Thus,

\[ f \circ e = f = e \circ f. \]

As a result, $e$ is indeed the identity element of $\mathrm{Gal}(L/K)$, thereby proving part (c).


Finally, let $f \in \mathrm{Gal}(L/K)$; since $f \in \mathrm{Bij}(L)$ and $\mathrm{Bij}(L)$ is a group, we know that $f$ has an
inverse $f^{-1}$ in $\mathrm{Bij}(L)$. Therefore, to conclude this proof, it suffices to show that
$f^{-1} \in \mathrm{Gal}(L/K)$. To this end, if $x, y \in L$, let $u = f^{-1}(x)$ and $v = f^{-1}(y)$. Since $f(u) = x$ and
$f(v) = y$, then the fact that $f$ is an automorphism gives us

\[ f^{-1}(x + y) = f^{-1}(f(u) + f(v)) = f^{-1}(f(u + v)) = (f^{-1} \circ f)(u + v) = e(u + v) = u + v = f^{-1}(x) + f^{-1}(y) \]

and

\[ f^{-1}(x \cdot y) = f^{-1}(f(u) \cdot f(v)) = f^{-1}(f(u \cdot v)) = (f^{-1} \circ f)(u \cdot v) = e(u \cdot v) = u \cdot v = f^{-1}(x) \cdot f^{-1}(y). \]

Thus, $f^{-1}$ satisfies all the conditions of Definition 5.8 and is an automorphism of $L$. In
addition, if $x \in K$, then $f(x) = x$, and it immediately follows that $f^{-1}(x) = x$. Thus, $f^{-1}$ also
fixes every $x \in K$, and we have shown that $f^{-1} \in \mathrm{Gal}(L/K)$, thereby proving part (d) and
concluding the proof.


We can now revisit two examples from the previous section and restate our work in the 
language of Galois groups. 


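Examples


1. If $\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$, let $e$ denote the identity map on $\mathbb{Q}(\sqrt{2})$ and let $\sigma$
denote the automorphism of $\mathbb{Q}(\sqrt{2})$ that sends $\sqrt{2}$ to $-\sqrt{2}$ and fixes every element
of $\mathbb{Q}$. Then

\[ \mathrm{Gal}(\mathbb{Q}(\sqrt{2})/\mathbb{Q}) = \{e, \sigma\}. \]


2. If $T = \{a + b2^{1/3} + c2^{2/3} \mid a, b, c \in \mathbb{Q}\}$ and if $e$ denotes the identity map on $T$, then

\[ \mathrm{Gal}(T/\mathbb{Q}) = \{e\}. \]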
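The first example can be spot-checked by modeling $a + b\sqrt{2}$ as the pair $(a, b)$. The sketch below is only an illustration on a finite sample of elements, not a proof; the names `add`, `mul`, `sigma`, and `samples` are chosen here.

```python
from fractions import Fraction
from itertools import product

def add(x, y):
    # (a + b*sqrt2) + (c + d*sqrt2)
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    # (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def sigma(x):
    # Sends sqrt2 to -sqrt2 and fixes every rational number.
    return (x[0], -x[1])

samples = [(Fraction(p), Fraction(q)) for p, q in product(range(-2, 3), repeat=2)]

respects_operations = all(
    sigma(add(x, y)) == add(sigma(x), sigma(y))
    and sigma(mul(x, y)) == mul(sigma(x), sigma(y))
    for x, y in product(samples, repeat=2)
)
is_own_inverse = all(sigma(sigma(x)) == x for x in samples)

print(respects_operations, is_own_inverse)
```

Since applying `sigma` twice returns the original element, the set $\{e, \sigma\}$ is closed under composition and $\sigma$ is its own inverse.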


When we look at commutative rings under addition, the groups we obtain are commutative. 
Similarly, when we look at the nonzero elements of a field under multiplication, the groups 
we obtain are also commutative. But in general, groups need not be commutative. In the 
exercises after this section and in Chapter 15, we will see examples of fields K C L such that 
Gal(L/K) is not commutative. However, the easiest examples of groups that are not 
commutative are groups of the form Bij(S), provided S has more than two elements. 


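Proposition 5.19. If $S$ is a set with more than two elements, then the group $\mathrm{Bij}(S)$ of
bijections of $S$ is not commutative.


Proof. Let $a, b, c$ be distinct elements of $S$. Then let $\sigma_1, \sigma_2 : S \to S$ be the bijections of $S$
defined as

\[ \sigma_1(a) = b, \quad \sigma_1(b) = a, \quad \sigma_1(c) = c, \quad \sigma_1(x) = x \text{ for every other } x \in S \]

and

\[ \sigma_2(a) = a, \quad \sigma_2(b) = c, \quad \sigma_2(c) = b, \quad \sigma_2(x) = x \text{ for every other } x \in S. \]

Then

\[ (\sigma_1 \circ \sigma_2)(a) = \sigma_1(\sigma_2(a)) = \sigma_1(a) = b, \]

whereas

\[ (\sigma_2 \circ \sigma_1)(a) = \sigma_2(\sigma_1(a)) = \sigma_2(b) = c. \]

Thus,

\[ (\sigma_1 \circ \sigma_2)(a) \neq (\sigma_2 \circ \sigma_1)(a). \]

Therefore, the functions $\sigma_1 \circ \sigma_2$ and $\sigma_2 \circ \sigma_1$ are different, so $\mathrm{Bij}(S)$ is not commutative.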
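The computation in this proof can be replayed on the three-element set $\{a, b, c\}$. The sketch below assumes each bijection is stored as a dictionary; the names `sigma1`, `sigma2`, and `compose` are chosen here for illustration.

```python
# The two bijections from the proof, restricted to S = {a, b, c}.
sigma1 = {"a": "b", "b": "a", "c": "c"}
sigma2 = {"a": "a", "b": "c", "c": "b"}

def compose(f, g):
    # (f o g)(x) = f(g(x)): apply g first, then f.
    return {x: f[g[x]] for x in g}

left = compose(sigma1, sigma2)    # sigma1 o sigma2
right = compose(sigma2, sigma1)   # sigma2 o sigma1

print(left["a"], right["a"])
print(left == right)
```

The two compositions already disagree at the single point `a`, which is all the proof needs.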


This chapter began with a specific goal: to introduce and study the complex numbers C. Yet, 
we also ended up introducing the three most important objects in abstract algebra: fields, 
commutative rings, and groups. It may seem odd to introduce all three abstract objects at such 
an early stage in this course. However, there is method to this madness. The algebraic structure 
of groups and the interaction between groups and fields is an essential part of abstract algebra 
and is at the heart of the proof of the insolvability of the quintic. By gaining experience now 
with concrete examples of groups of automorphisms and their interaction with fields and roots 
of polynomials, you will be better equipped to handle some of the more theoretical aspects of 
abstract algebra needed to prove fundamental results like the insolvability of the quintic. 


Exercises for Section 5.5 


In exercises 1 and 2, we compute Galois groups. Both exercises have 12 parts, but each part is 
relatively short. Many parts of these exercises will use computations from earlier parts. 


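1. Let $\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2}i$ and then let $\mathbb{Q}(\omega) = \{a + b\omega \mid a, b \in \mathbb{Q}\}$. In this exercise, we will
examine $\mathbb{Q}(\omega)$ and eventually determine $\mathrm{Gal}(\mathbb{Q}(\omega)/\mathbb{Q})$.
(a) If $a, b, c, d \in \mathbb{Q}$ such that $a + b\omega = c + d\omega$, show that $a = c$ and $b = d$.
(b) Compute $\omega^2$, $\omega^3$.
(c) Show that both $\omega$ and $\omega^2$ are roots of the polynomial $x^2 + x + 1$.
(d) If $a, b, c, d \in \mathbb{Q}$, find $A, B$ such that
\[ (a + b\omega) + (c + d\omega) = A + B\omega. \]
$A$ and $B$ should be expressions in $a, b, c, d$.
(e) If $a, b, c, d \in \mathbb{Q}$, find $C, D$ such that
\[ (a + b\omega) \cdot (c + d\omega) = C + D\omega. \]
$C$ and $D$ should be expressions in $a, b, c, d$.
(f) Use Proposition 5.15 to show that $\mathbb{Q}(\omega)$ is a commutative ring.
(g) If $a, b \in \mathbb{Q}$ such that $a + b\omega \neq 0$, show that $(a + b\omega)((a - b) - b\omega)$ is a nonzero
element of $\mathbb{Q}$.
(h) If $a, b \in \mathbb{Q}$ such that $a + b\omega \neq 0$, compute $(a + b\omega)^{-1}$, and then show that $\mathbb{Q}(\omega)$ is a
field.
(i) If $\sigma \in \mathrm{Gal}(\mathbb{Q}(\omega)/\mathbb{Q})$, show that $\sigma(\omega) = \omega$ or $\sigma(\omega) = \omega^2$, and then show that
$\mathrm{Gal}(\mathbb{Q}(\omega)/\mathbb{Q})$ has at most two elements.
(j) Let $\tau$ be defined as $\tau(a + b\omega) = a + b\omega^2$, for all $a, b \in \mathbb{Q}$. Show that $\tau(a + b\omega)$ is
also equal to $(a - b) - b\omega$. Also show that $\tau^2$ is the identity map on $\mathbb{Q}(\omega)$, and use
this fact to show that $\tau$ is a bijection of the set $\mathbb{Q}(\omega)$.
(k) Using the formulas you found in parts (d) and (e) of this problem, show that the
function $\tau$ from part (j) is an automorphism of $\mathbb{Q}(\omega)$ that fixes every element of $\mathbb{Q}$.
(l) Conclude that $\mathrm{Gal}(\mathbb{Q}(\omega)/\mathbb{Q}) = \{e, \tau\}$, where $e$ is the identity map and $\tau$ is the
function from part (j).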


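2. Let $\gamma = \frac{1}{2} + \frac{\sqrt{3}}{2}i$ and then let $\mathbb{Q}(\gamma) = \{a + b\gamma \mid a, b \in \mathbb{Q}\}$. In this exercise, we will
examine $\mathbb{Q}(\gamma)$ and eventually determine $\mathrm{Gal}(\mathbb{Q}(\gamma)/\mathbb{Q})$.
(a) If $a, b, c, d \in \mathbb{Q}$ such that $a + b\gamma = c + d\gamma$, show that $a = c$ and $b = d$.
(b) Compute $\gamma^2$, $\gamma^3$.
(c) Show that both $\gamma$ and $-\gamma^2$ are roots of the polynomial $x^2 - x + 1$.
(d) If $a, b, c, d \in \mathbb{Q}$, find $A, B$ such that
\[ (a + b\gamma) + (c + d\gamma) = A + B\gamma. \]
$A$ and $B$ should be expressions in $a, b, c, d$.
(e) If $a, b, c, d \in \mathbb{Q}$, find $C, D$ such that
\[ (a + b\gamma) \cdot (c + d\gamma) = C + D\gamma. \]
$C$ and $D$ should be expressions in $a, b, c, d$.
(f) Use Proposition 5.15 to show that $\mathbb{Q}(\gamma)$ is a commutative ring.
(g) If $a, b \in \mathbb{Q}$ such that $a + b\gamma \neq 0$, show that $(a + b\gamma)((a + b) - b\gamma)$ is a nonzero
element of $\mathbb{Q}$.
(h) If $a, b \in \mathbb{Q}$ such that $a + b\gamma \neq 0$, compute $(a + b\gamma)^{-1}$, and then show that $\mathbb{Q}(\gamma)$ is a
field.
(i) If $\sigma \in \mathrm{Gal}(\mathbb{Q}(\gamma)/\mathbb{Q})$, show that $\sigma(\gamma) = \gamma$ or $\sigma(\gamma) = -\gamma^2$, and then show that
$\mathrm{Gal}(\mathbb{Q}(\gamma)/\mathbb{Q})$ has at most two elements.
(j) Let $\tau$ be defined as $\tau(a + b\gamma) = a - b\gamma^2$, for all $a, b \in \mathbb{Q}$. Show that $\tau(a + b\gamma)$ is also
equal to $(a + b) - b\gamma$. Also show that $\tau^2$ is the identity map on $\mathbb{Q}(\gamma)$, and use this
fact to show that $\tau$ is a bijection of the set $\mathbb{Q}(\gamma)$.
(k) Using the formulas you found in parts (d) and (e) of this problem, show that the
function $\tau$ from part (j) is an automorphism of $\mathbb{Q}(\gamma)$ that fixes every element of $\mathbb{Q}$.
(l) Conclude that $\mathrm{Gal}(\mathbb{Q}(\gamma)/\mathbb{Q}) = \{e, \tau\}$, where $e$ is the identity map and $\tau$ is the
function from part (j).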


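In exercises 3-26, we let $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ denote the field consisting of all elements of the form

\[ \{a + b\sqrt{2} + c\sqrt{3} + d\sqrt{5} + e\sqrt{6} + f\sqrt{10} + g\sqrt{15} + h\sqrt{30} \mid a, b, c, d, e, f, g, h \in \mathbb{Q}\}. \]

If we let $G = \mathrm{Gal}(\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})/\mathbb{Q})$, then $G$ consists of eight automorphisms, all of which
are completely determined by their behavior on $\sqrt{2}$, $\sqrt{3}$, and $\sqrt{5}$. We can represent each
element of $G$ in the form $\sigma_{(i,j,k)}$, where $0 \le i, j, k \le 1$, and

\[ \sigma_{(i,j,k)}(\sqrt{2}) = (-1)^i\sqrt{2}, \quad \sigma_{(i,j,k)}(\sqrt{3}) = (-1)^j\sqrt{3}, \quad \sigma_{(i,j,k)}(\sqrt{5}) = (-1)^k\sqrt{5}. \]

For example, using this notation, $\sigma_{(1,0,1)}(\sqrt{2}) = -\sqrt{2}$, $\sigma_{(1,0,1)}(\sqrt{3}) = \sqrt{3}$, and $\sigma_{(1,0,1)}(\sqrt{5}) =
-\sqrt{5}$. You may assume different choices of $a, b, c, d, e, f, g, h$ always produce different
elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$.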
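To get used to this notation, it may help to see how $\sigma_{(i,j,k)}$ acts on the eight coefficients $a, \ldots, h$. The sketch below is a hypothetical illustration that reproduces the example $\sigma_{(1,0,1)}$ above; it assumes an element is stored as the tuple $(a, b, c, d, e, f, g, h)$ in the order of the displayed form, and the names `FACTORS` and `apply_sigma` are chosen here.

```python
# Basis order: 1, sqrt2, sqrt3, sqrt5, sqrt6, sqrt10, sqrt15, sqrt30.
# FACTORS records which of sqrt2, sqrt3, sqrt5 multiply to give each basis
# element, for example sqrt6 = sqrt2 * sqrt3 is recorded as (1, 1, 0).
FACTORS = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1),
           (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]

def apply_sigma(triple, coeffs):
    """Apply sigma_(i,j,k) to a tuple of coefficients (a, b, ..., h)."""
    i, j, k = triple
    result = []
    for (p, q, r), coeff in zip(FACTORS, coeffs):
        # Each factor sqrt2, sqrt3, sqrt5 contributes its own sign change.
        sign = (-1) ** (i * p + j * q + k * r)
        result.append(sign * coeff)
    return tuple(result)

# Signs picked up by 1, sqrt2, sqrt3, sqrt5, sqrt6, sqrt10, sqrt15, sqrt30
# under sigma_(1,0,1).
print(apply_sigma((1, 0, 1), (1, 1, 1, 1, 1, 1, 1, 1)))
```

The sign on $\sqrt{6}$, for instance, is the product of the signs on $\sqrt{2}$ and $\sqrt{3}$, because an automorphism respects multiplication.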


3. For which of the eight triples of the form $(i, j, k)$ is $\sigma_{(i,j,k)}$ the identity map?

4. Show that, for each of the eight triples of the form $(i, j, k)$, the automorphism $\sigma_{(i,j,k)}^2$ is
the identity map.

5. In $G$, compute $\sigma_{(0,1,1)} \circ \sigma_{(1,1,0)}$.

6. In $G$, compute $\sigma_{(1,1,1)} \circ \sigma_{(0,1,0)}$.

7. In $G$, compute $\sigma_{(0,0,1)} \circ \sigma_{(0,1,0)}$.

8. In $G$, compute $\sigma_{(1,0,1)} \circ \sigma_{(1,0,0)}$.

9. In $G$, compute $\sigma_{(0,1,0)} \circ \sigma_{(1,0,1)}$.

10. In $G$, compute $\sigma_{(1,1,0)} \circ \sigma_{(\;,1,1)}$.

11. In $G$, compute $\sigma_{(0,1,1)} \circ \sigma_{(0,1,0)}$.

12. In $G$, compute $\sigma_{(1,0,0)} \circ \sigma_{(1,1,0)}$.

13. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by the
automorphism $\sigma_{(1,0,0)}$.

14. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by the
automorphism $\sigma_{(0,1,0)}$.

15. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by the
automorphism $\sigma_{(0,0,1)}$.

16. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by the
automorphism $\sigma_{(1,1,0)}$.

17. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by the
automorphism $\sigma_{(1,0,1)}$.

18. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by the
automorphism $\sigma_{(0,1,1)}$.

19. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by the
automorphism $\sigma_{(1,1,1)}$.

20. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by both
$\sigma_{(1,0,0)}$ and $\sigma_{(0,1,0)}$.

21. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by both
$\sigma_{(1,0,0)}$ and $\sigma_{(0,0,1)}$.

22. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by both
$\sigma_{(0,1,0)}$ and $\sigma_{(0,0,1)}$.

23. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by both
$\sigma_{(1,1,0)}$ and $\sigma_{(0,0,1)}$.

24. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by both
$\sigma_{(0,1,0)}$ and $\sigma_{(1,0,1)}$.

25. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by both
$\sigma_{(0,1,1)}$ and $\sigma_{(1,0,0)}$.

26. In terms of $a, b, c, d, e, f, g, h$, describe all elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ fixed by both
$\sigma_{(0,1,1)}$ and $\sigma_{(1,0,1)}$.


In exercises 27-45, we let $\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2}i$ and then let $\mathbb{Q}(2^{1/3}, \omega)$ denote the field consisting of
all elements of the form

\[ \{a + b2^{1/3} + c2^{2/3} + d\omega + e2^{1/3}\omega + f2^{2/3}\omega \mid a, b, c, d, e, f \in \mathbb{Q}\}. \]

If we let $G = \mathrm{Gal}(\mathbb{Q}(2^{1/3}, \omega)/\mathbb{Q})$, then $G$ consists of six automorphisms, all of which are
completely determined by their behavior on $2^{1/3}$ and $\omega$. In these exercises, we will see that $G$ is
not commutative, so Galois groups need not be commutative. We can represent each element
of $G$ in the form $\sigma_{(i,j)}$, where $0 \le i \le 2$, $1 \le j \le 2$, and

\[ \sigma_{(i,j)}(2^{1/3}) = 2^{1/3}\omega^i, \quad \sigma_{(i,j)}(\omega) = \omega^j. \]

For example, using this notation, $\sigma_{(1,2)}(2^{1/3}) = 2^{1/3}\omega$ and $\sigma_{(1,2)}(\omega) = \omega^2$. Recall that since
$\omega^3 = 1$ and $\omega^2 + \omega + 1 = 0$, we can use the terms $\omega^2$ and $-1 - \omega$ interchangeably. You may
assume different choices of $a, b, c, d, e, f$ always produce different elements of $\mathbb{Q}(2^{1/3}, \omega)$.


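27. For which of the six pairs of the form $(i, j)$ is $\sigma_{(i,j)}$ the identity map?

28. In $G$, compute $\sigma_{(0,2)} \circ \sigma_{(0,2)}$.

29. In $G$, compute $\sigma_{(1,2)} \circ \sigma_{(1,2)}$.

30. In $G$, compute $\sigma_{(2,2)} \circ \sigma_{(2,2)}$.

31. In $G$, compute $\sigma_{(1,1)} \circ \sigma_{(1,1)}$ and $\sigma_{(1,1)} \circ \sigma_{(1,1)} \circ \sigma_{(1,1)}$.

32. In $G$, compute $\sigma_{(2,1)} \circ \sigma_{(2,1)}$ and $\sigma_{(2,1)} \circ \sigma_{(2,1)} \circ \sigma_{(2,1)}$.

33. In $G$, compute $\sigma_{(0,2)} \circ \sigma_{(1,2)}$ and $\sigma_{(1,2)} \circ \sigma_{(0,2)}$.

34. In $G$, compute $\sigma_{(0,2)} \circ \sigma_{(2,2)}$ and $\sigma_{(2,2)} \circ \sigma_{(0,2)}$.

35. In $G$, compute $\sigma_{(2,2)} \circ \sigma_{(1,2)}$ and $\sigma_{(1,2)} \circ \sigma_{(2,2)}$.

36. In $G$, compute $\sigma_{(0,2)} \circ \sigma_{(1,1)}$ and $\sigma_{(1,1)} \circ \sigma_{(0,2)}$.

37. In $G$, compute $\sigma_{(1,2)} \circ \sigma_{(1,1)}$ and $\sigma_{(1,1)} \circ \sigma_{(1,2)}$.

38. In $G$, compute $\sigma_{(2,2)} \circ \sigma_{(1,1)}$ and $\sigma_{(1,1)} \circ \sigma_{(2,2)}$.

39. In $G$, compute $\sigma_{(0,2)} \circ \sigma_{(2,1)}$ and $\sigma_{(2,1)} \circ \sigma_{(0,2)}$.

40. In $G$, compute $\sigma_{(1,2)} \circ \sigma_{(2,1)}$ and $\sigma_{(2,1)} \circ \sigma_{(1,2)}$.

41. In $G$, compute $\sigma_{(2,2)} \circ \sigma_{(2,1)}$ and $\sigma_{(2,1)} \circ \sigma_{(2,2)}$.

42. In terms of $a, b, c, d, e, f$, describe all elements of $\mathbb{Q}(2^{1/3}, \omega)$ fixed by the automorphism $\sigma_{(0,2)}$.

43. In terms of $a, b, c, d, e, f$, describe all elements of $\mathbb{Q}(2^{1/3}, \omega)$ fixed by the automorphism $\sigma_{(1,2)}$.

44. In terms of $a, b, c, d, e, f$, describe all elements of $\mathbb{Q}(2^{1/3}, \omega)$ fixed by the automorphism $\sigma_{(2,2)}$.

45. The same elements of $\mathbb{Q}(2^{1/3}, \omega)$ are fixed by the automorphisms $\sigma_{(1,1)}$ and $\sigma_{(2,1)}$. Describe these elements in terms of $a, b, c, d, e, f$.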


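For exercises 46-53, let $S = \{a, b, c, d\}$ and let $\sigma, \tau$ be the elements of $\operatorname{Bij}(S)$ defined as

\[ \sigma(a) = b, \quad \sigma(b) = c, \quad \sigma(c) = d, \quad \sigma(d) = a, \]
\[ \tau(a) = d, \quad \tau(b) = c, \quad \tau(c) = b, \quad \tau(d) = a. \]

In each of the exercises below, compute the values of the given functions at $a$, $b$, $c$, and $d$.

46. $\sigma \circ \sigma$

47. $\sigma \circ \sigma \circ \sigma$

48. $\sigma \circ \sigma \circ \sigma \circ \sigma$

49. $\tau \circ \tau$

50. $\sigma \circ \tau$

51. $\tau \circ \sigma$

52. $\sigma \circ \tau \circ \sigma$

53. $\tau \circ \sigma \circ \tau$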




The Fundamental Theorem of Algebra


In this chapter, we take a more geometric and concrete view of the complex numbers. We first 
take a geometric view of the real numbers and then see how this approach extends, in a natural 
way, to the complex numbers. This will provide us with the background we need to prove the 
Fundamental Theorem of Algebra. 


In Chapter 4, we indicated that it is often very useful in mathematics to describe objects in 
terms of equivalence relations and equivalence classes. This philosophy will assist us in 
examining both the real and complex numbers from a more geometric perspective. 


6.1 Representing Real Numbers and Complex Numbers Geometrically

We are accustomed to viewing a real number as a point on a number line. However, we can 

also view real numbers as arrows lying on a number line. For example, the number 2 can be 

viewed as the arrow beginning at 0 and ending at 2. 




However, suppose we also consider the arrow which begins at 5 and ends at 7. Or we could 
examine the arrow that begins at —3 and ends at —1. Both of these arrows could also be used 
to represent the number 2. 


Similarly, the arrow beginning at 5 and ending at 2 also represents the number —3, as does the 
arrow beginning at 1 and ending at —2. 


We can see that there are an infinite number of arrows that represent each real number. In fact, 
given a real number a and a point P on the number line, there is exactly one arrow beginning
at P that represents a. Since many different arrows can represent the same number, hopefully 
you can see that this is a perfect situation to apply equivalence relations and equivalence 
classes. In this context, we say that two arrows are equivalent if they have the same length and 
are going in the same direction. There are only two possible directions in which an arrow can 
go. Positive real numbers are represented by arrows that go to the right and negative real 
numbers are represented by arrows that go to the left. Zero is somewhat special, and it can be 
represented by an arrow consisting of only a single point. It is easy to see that we have 
described an equivalence relation on the set of arrows. 


For example, the arrows going from 0 to 2, from 5 to 7, and from —3 to —1, all belong to the 
same equivalence class and represent the number 2. Similarly, the arrows going from 0 to —3, 
from 5 to 2, and from 1 to —2, all belong to the same equivalence class and represent the 
number —3. 


We can now view the addition of real numbers geometrically. Suppose $a, b \in \mathbb{R}$; we will describe and illustrate how to find an arrow representing the sum $a + b$. First, choose any arrow representing $a$ and then let $P$ and $Q$ denote, respectively, the beginning and end of the arrow.

Next, look at the arrow that represents $b$ and starts at $Q$. Then let $R$ denote the endpoint of this arrow. The arrow that begins at $P$ and ends at $R$ then represents $a + b$.


For example, suppose we wish to add the numbers 2 and —3. First, we choose any arrow that 
represents 2. There are many choices we could make, but suppose we pick the arrow that 
begins at 4 and ends at 6. 


At the point 6, place an arrow that represents the number —3. This arrow will end at the 
point 3. Therefore, the sum 2 + (—3) can be represented by an arrow that begins at 4 and 
ends at 3. 


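[Figure: a number line showing the arrow for $2$ from 4 to 6, the arrow for $-3$ from 6 to 3, and the resulting arrow for $2 + (-3)$ from 4 to 3.]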
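The construction above can be sketched in a few lines of Python. This is only an illustration: the `Arrow` class and the helper names `equivalent` and `add` are hypothetical and do not come from the text. An arrow is modeled by its start and end points, and two arrows are treated as equivalent when they represent the same real number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrow:
    """An arrow on the number line (hypothetical helper, not from the text)."""
    start: int
    end: int

    def value(self):
        # The real number represented by the arrow: signed length.
        return self.end - self.start


def equivalent(u, v):
    # Same length and same direction means same signed length.
    return u.value() == v.value()


def add(u, b):
    # Place an arrow representing b at the end of u, and return the
    # arrow from the start of u to the endpoint of that second arrow.
    return Arrow(u.start, u.end + b)


two = Arrow(4, 6)       # one of many arrows representing 2
total = add(two, -3)    # arrow representing 2 + (-3)
```

Here `total` begins at 4 and ends at 3, matching the worked example, and any other arrow chosen to represent 2 would give an equivalent result.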


Given a real number a, we can let |a| denote the length of the arrows that represent a. Observe 
that |a| is the same as the ordinary absolute value of a. We now state, without proof, two 
familiar facts about the absolute values of real numbers that will be generalized to complex 
numbers later in this chapter. 


Lemma 6.1. If $a, b \in \mathbb{R}$, then
(a) $|a \cdot b| = |a| \cdot |b|$,
(b) $|a + b| \le |a| + |b|$.


Having discussed how to visualize real numbers as equivalence classes of arrows on a number 
line, we now begin discussing how to visualize complex numbers as equivalence classes of 
arrows in the plane. Fortunately, examining arrows in the plane is, in some ways, easier and 
more natural than examining arrows on a line. 


Given two arrows in the plane, we consider them to be equivalent if they have the same length and are going in the same direction. For example, the arrow beginning at the point $P_1 = (1, 2)$ and ending at the point $Q_1 = (4, 6)$ is equivalent to the arrow beginning at $P_2 = (-1, 0)$ and ending at the point $Q_2 = (2, 4)$. Both of these arrows have length 5. Furthermore, these arrows are going in the same direction as they are both going upward and both lie on lines whose slope is $\frac{4}{3}$.


Equivalence classes of arrows can be added in the plane in virtually the same manner as they were on the line. Suppose $[\eta]$ and $[\nu]$ are two equivalence classes of arrows in the plane. Let $\eta_1$ be any arrow in the plane belonging to the equivalence class $[\eta]$. Then $\eta_1$ begins at some point $P$ and ends at some point $Q$. There is some arrow, which we will call $\nu_1$, that belongs to the class $[\nu]$ and begins at the point $Q$. We will let $R$ denote the endpoint of $\nu_1$ and now let $\psi$ denote the arrow that begins at $P$ and ends at $R$. The equivalence class $[\psi]$ that contains $\psi$ is now the sum $[\eta] + [\nu]$.

[Figure: the arrow $\eta_1$ from $P$ to $Q$, the arrow $\nu_1$ from $Q$ to $R$, and the arrow $\psi$ from $P$ to $R$.]


Since we are actually adding equivalence classes of arrows and not just arrows, we need to check that addition is well defined. To illustrate this, observe that we could have chosen a different element $\eta_2 \in [\eta]$. Then $\eta_2$ begins at some point $P_1$ and ends at $Q_1$. Now let $\nu_2$ be the element of $[\nu]$ that begins at $Q_1$. We now let $R_1$ denote the endpoint of $\nu_2$ and let $\psi_1$ denote the arrow that begins at $P_1$ and ends at $R_1$. Therefore, the equivalence class $[\psi_1]$ that contains $\psi_1$ is also the sum $[\eta] + [\nu]$. As a result, to show that addition is well defined, we need to show that the equivalence classes $[\psi]$ and $[\psi_1]$ are the same. This will be the case precisely if the arrows $\psi$ and $\psi_1$ have the same length and go in the same direction. To see that this is the case, we refer to the following diagram.

[Figure: the triangles $PQR$ and $P_1Q_1R_1$ formed by the arrows $\eta_1, \nu_1, \psi$ and $\eta_2, \nu_2, \psi_1$.]

Since $\eta_1$ and $\eta_2$ are equivalent, line segments $PQ$ and $P_1Q_1$ must have the same length. Similarly, since $\nu_1$ and $\nu_2$ are equivalent, line segments $QR$ and $Q_1R_1$ also have the same length. Furthermore, since $\eta_1$ and $\eta_2$ go in the same direction, as do $\nu_1$ and $\nu_2$, it follows that angles $PQR$ and $P_1Q_1R_1$ must be congruent. Using side-angle-side (SAS), we see that triangles $PQR$ and $P_1Q_1R_1$ are congruent. As a result, line segments $PR$ and $P_1R_1$ must have the same length and are parallel. Hence, the arrows $\psi$ and $\psi_1$ have the same length and go in the same direction. Therefore, $[\psi]$ and $[\psi_1]$ are indeed the same equivalence class. A more formal argument would also handle the special case where $P$, $Q$, and $R$ lie on the same line. The preceding discussion, although somewhat informal, hopefully convinces you that the addition of equivalence classes of arrows is well defined.
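A small numerical check can make the well-definedness argument concrete. The sketch below is an illustration under an assumption not stated in the text: two arrows have the same length and direction exactly when they have the same coordinate displacement. The function names are hypothetical, and the arrow `nu` is an arbitrary choice.

```python
def displacement(arrow):
    # An arrow is a pair (start point, end point).
    (x1, y1), (x2, y2) = arrow
    return (x2 - x1, y2 - y1)


def equivalent(u, v):
    # Modeling assumption: same length and direction <=> same displacement.
    return displacement(u) == displacement(v)


def add_classes(eta, nu):
    # eta runs from P to Q; slide a copy of nu so that it starts at Q and
    # ends at R; the sum is represented by the arrow from P to R.
    p, q = eta
    dx, dy = displacement(nu)
    r = (q[0] + dx, q[1] + dy)
    return (p, r)


eta1 = ((1, 2), (4, 6))     # the arrow from P_1 to Q_1 in the text
eta2 = ((-1, 0), (2, 4))    # the equivalent arrow from P_2 to Q_2
nu = ((0, 0), (2, -5))      # an arbitrary second arrow

psi = add_classes(eta1, nu)
psi1 = add_classes(eta2, nu)
```

Although `psi` and `psi1` start at different points, they have the same displacement, so they lie in the same equivalence class, as the congruent-triangle argument predicts.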


We can now associate to every complex number an equivalence class of arrows in the plane. If $\alpha = a + bi \in \mathbb{C}$, we associate to it the class consisting of all arrows equivalent to the arrow beginning at the point $(0, 0)$ and ending at the point $(a, b)$.

[Figure: the arrow from $(0, 0)$ to $(a, b)$.]


So far, we have shown how to add equivalence classes of arrows in the plane and also how to associate to every complex number an equivalence class of arrows in the plane. The next step in viewing addition in $\mathbb{C}$ geometrically is to show that the addition of two complex numbers corresponds to the addition of the equivalence classes that represent these numbers. In other words, we need to prove that if we add arrows that represent the complex numbers $\alpha$ and $\beta$, then we obtain an arrow that represents the complex number $\alpha + \beta$. To do this, we begin with a simple but important observation. Suppose $\beta = c + di \in \mathbb{C}$ and let $\nu$ be any of the arrows that are in the equivalence class associated to $\beta$. Then $\nu$ begins at some point $P = (x_1, y_1)$ and ends at some point $Q = (x_2, y_2)$. On the other hand, we can also look at the arrow representing $\beta$ that begins at $(0, 0)$ and ends at $(c, d)$. We can then consider the right triangle whose vertices are $P$, $Q$, and $(x_2, y_1)$ and observe that it is congruent to the triangle with vertices $(0, 0)$, $(c, d)$, and $(c, 0)$.

[Figure: the right triangle with vertices $P = (x_1, y_1)$, $Q = (x_2, y_2)$, and $(x_2, y_1)$.]


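In light of this, it is easy to see that

\[ c = x_2 - x_1 \quad \text{and} \quad d = y_2 - y_1. \]

As a result,

\[ x_2 = x_1 + c \quad \text{and} \quad y_2 = y_1 + d. \tag{1} \]

Now let $\alpha = a + bi$, $\beta = c + di \in \mathbb{C}$, and first consider the arrow beginning at $(0, 0)$ and ending at $(a, b)$. At the point $(a, b)$, place an arrow that is associated to $\beta = c + di$. The equations in (1) now tell us that since this arrow begins at $(x_1, y_1) = (a, b)$, it must end at the point $(x_2, y_2) = (a + c, b + d)$.

[Figure: the arrow for $\alpha$ from $(0, 0)$ to $(a, b)$, the arrow for $\beta$ from $(a, b)$ to $(a + c, b + d)$, and the arrow from $(0, 0)$ to $(a + c, b + d)$.]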


Thus, we have added an arrow that represents $\alpha$ to an arrow that represents $\beta$ and obtained an arrow that begins at $(0, 0)$ and ends at $(a + c, b + d)$. However, the arrow that begins at $(0, 0)$ and ends at $(a + c, b + d)$ represents the complex number $\alpha + \beta = (a + c) + (b + d)i$. Therefore, when we add arrows that represent $\alpha$ and $\beta$, we obtain an arrow that represents $\alpha + \beta$. Thus, the addition of complex numbers does indeed correspond to the addition of equivalence classes of arrows in the plane.
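The correspondence can be checked on a concrete pair of numbers. The sketch below uses Python's built-in `complex` type; the helper `endpoint` and the sample values of `alpha` and `beta` are illustrative assumptions, not part of the text.

```python
def endpoint(start, z):
    # Endpoint of the arrow representing z that begins at `start`,
    # using equations (1): x2 = x1 + c and y2 = y1 + d.
    x1, y1 = start
    return (x1 + z.real, y1 + z.imag)


alpha = complex(3, 4)     # alpha = 3 + 4i
beta = complex(12, -5)    # beta = 12 - 5i

q = endpoint((0, 0), alpha)   # arrow for alpha starts at the origin
r = endpoint(q, beta)         # arrow for beta is placed at its end

total = alpha + beta          # ordinary addition in C
```

The point `r` reached by chaining the two arrows has the same coordinates as the real and imaginary parts of `total`.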


When adding the complex numbers $\alpha$ and $\beta$ geometrically in the preceding diagram, we saw that the arrow representing $\alpha + \beta$ is the third side of a triangle where the other two sides are arrows representing $\alpha$ and $\beta$. Recall that in a triangle, the length of one side can never exceed the sum of the lengths of the other two sides. This is a geometric fact that we should certainly express algebraically. However, we first need to define the length of a complex number. If $\alpha = a + bi \in \mathbb{C}$, then one of the arrows representing $\alpha$ begins at $(0, 0)$ and ends at $(a, b)$. Therefore, this arrow can be thought of as the hypotenuse of a right triangle where the lengths of the other two sides are $|a|$ and $|b|$.


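[Figure: a right triangle with legs of lengths $|a|$ and $|b|$ whose hypotenuse is the arrow from $(0, 0)$ to $(a, b)$.]

The Pythagorean Theorem tells us that the length of this arrow must be $\sqrt{|a|^2 + |b|^2}$. However, since $|a|^2 = a^2$ and $|b|^2 = b^2$, this is the same as $\sqrt{a^2 + b^2}$. This motivates

Definition 6.2. If $\alpha = a + bi \in \mathbb{C}$, then we define the length or norm of $\alpha$, denoted as $|\alpha|$, to be $\sqrt{a^2 + b^2}$.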


Observe that if $a \in \mathbb{R}$, then viewing $a$ as the complex number $a + 0i$, we have $|a| = \sqrt{a^2 + 0^2} = \sqrt{a^2}$, which is identical to the absolute value of $a$. Thus, the concept of the length of a complex number generalizes the notion of the absolute value of a real number. We can now record some of the basic properties of the length of complex numbers. As suggested by an earlier diagram, property (c) is, not surprisingly, referred to as the triangle inequality.
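As a quick illustration of Definition 6.2, the sketch below computes the norm directly from the formula and compares it with Python's built-in `abs`, which returns the same quantity for `complex` values. The function name `norm` is a hypothetical helper chosen for this example.

```python
import math


def norm(z):
    # Length of z = a + bi according to Definition 6.2: sqrt(a^2 + b^2).
    return math.sqrt(z.real ** 2 + z.imag ** 2)


length = norm(complex(3, 4))           # sqrt(9 + 16)
real_case = norm(complex(-7, 0))       # agrees with the absolute value of -7
```

For a real input such as $-7$, the norm reduces to the ordinary absolute value, as observed above.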


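Lemma 6.3. Let $\alpha = a + bi$, $\beta = c + di \in \mathbb{C}$; then

(a) $|\alpha| = |\alpha^*| = \sqrt{\alpha \cdot \alpha^*}$ and $|\alpha|^2 = |\alpha^*|^2 = \alpha \cdot \alpha^*$, where $\alpha^*$ is the complex conjugate of $\alpha$,
(b) $|\alpha \cdot \beta| = |\alpha| \cdot |\beta|$,
(c) $|\alpha + \beta| \le |\alpha| + |\beta|$,
(d) if $\alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbb{C}$, then $|\alpha_1 + \alpha_2 + \cdots + \alpha_n| \le |\alpha_1| + |\alpha_2| + \cdots + |\alpha_n|$.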


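Proof. For part (a), observe that

\[ \alpha \cdot \alpha^* = (a + bi)(a - bi) = a^2 + b^2. \]

Thus $\alpha \cdot \alpha^* = |\alpha|^2$ and $\sqrt{\alpha \cdot \alpha^*} = \sqrt{a^2 + b^2} = |\alpha|$. Since $\alpha^* = a - bi$ and $a^2 + b^2 = a^2 + (-b)^2$, it immediately also follows that $|\alpha^*| = \sqrt{\alpha \cdot \alpha^*}$ and $|\alpha^*|^2 = \alpha \cdot \alpha^*$.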


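For part (b), recall, from Lemma 5.7(c), that $(\alpha \cdot \beta)^* = \alpha^* \cdot \beta^*$. Also recall, from Lemma 5.10(b), that $\alpha \cdot \alpha^*$ and $\beta \cdot \beta^*$ are nonnegative real numbers. Combining these facts with part (a), we see that

\[ |\alpha \cdot \beta| = \sqrt{(\alpha \cdot \beta)(\alpha \cdot \beta)^*} = \sqrt{\alpha \cdot \beta \cdot \alpha^* \cdot \beta^*} = \sqrt{(\alpha \cdot \alpha^*)(\beta \cdot \beta^*)} = \sqrt{\alpha \cdot \alpha^*}\,\sqrt{\beta \cdot \beta^*} = |\alpha| \cdot |\beta|, \]

thereby proving part (b).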


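From a geometric perspective, part (c), the triangle inequality, is intuitively quite clear. However, an algebraic proof is somewhat technical and not very intuitive. To prove part (c), we begin by observing that if $\gamma = e + fi \in \mathbb{C}$ then, by Lemma 5.10(b), $\gamma + \gamma^* \in \mathbb{R}$. Therefore,

\[ \gamma + \gamma^* = (e + fi) + (e - fi) = 2e \le 2|e| = 2\sqrt{e^2} \le 2\sqrt{e^2 + f^2} = 2|\gamma|, \]

and so we can conclude that

\[ \gamma + \gamma^* \le 2|\gamma|. \tag{2} \]

If we let $\gamma = \alpha \cdot \beta^*$ in (2) and use part (b) along with the fact that $|\beta| = |\beta^*|$, we obtain

\[ \alpha \cdot \beta^* + \alpha^* \cdot \beta = (\alpha \cdot \beta^*) + (\alpha \cdot \beta^*)^* \le 2|\alpha \cdot \beta^*| = 2|\alpha||\beta^*| = 2|\alpha||\beta|. \tag{3} \]

Using part (a) along with (3), we now have

\[ (|\alpha + \beta|)^2 = (\alpha + \beta)(\alpha + \beta)^* = (\alpha + \beta)(\alpha^* + \beta^*) = \alpha \cdot \alpha^* + \beta \cdot \beta^* + \alpha \cdot \beta^* + \alpha^* \cdot \beta \]
\[ = |\alpha|^2 + |\beta|^2 + \alpha \cdot \beta^* + \alpha^* \cdot \beta \le |\alpha|^2 + |\beta|^2 + 2|\alpha||\beta| = (|\alpha| + |\beta|)^2. \]

Taking square roots of the left-hand and right-hand terms in the preceding inequality yields $|\alpha + \beta| \le |\alpha| + |\beta|$, as desired.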

Part (d) is a generalization of part (c) and can be proved with Mathematical Induction. Let $T$ be those positive integers $k$ such that whenever $\alpha_1, \ldots, \alpha_k \in \mathbb{C}$, we have

\[ |\alpha_1 + \cdots + \alpha_k| \le |\alpha_1| + \cdots + |\alpha_k|. \]

To see that $1 \in T$, we merely observe that if $\alpha_1 \in \mathbb{C}$, then $|\alpha_1| \le |\alpha_1|$. Using Mathematical Induction, in order to show that $T = \mathbb{N}$, it now suffices to show that whenever $T$ contains a positive integer $k$, then $T$ also contains $k + 1$. To this end, let $\alpha_1, \ldots, \alpha_k, \alpha_{k+1} \in \mathbb{C}$. For convenience, let $\beta = \alpha_1 + \cdots + \alpha_k$. Now, using part (c) and the fact that $k \in T$, we have

\[ |\alpha_1 + \cdots + \alpha_k + \alpha_{k+1}| = |(\alpha_1 + \cdots + \alpha_k) + \alpha_{k+1}| = |\beta + \alpha_{k+1}| \le |\beta| + |\alpha_{k+1}| \]
\[ \le (|\alpha_1| + \cdots + |\alpha_k|) + |\alpha_{k+1}| = |\alpha_1| + \cdots + |\alpha_k| + |\alpha_{k+1}|. \]

Thus, $k + 1 \in T$, thereby concluding the proof.
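A numerical spot check is no substitute for the proof, but it can help build confidence in the statements of Lemma 6.3. The sketch below tests parts (a) through (d) on a few sample values; the sample list, the function name `check_lemma`, and the small tolerance used for floating-point comparisons are all choices made for this illustration.

```python
import math

# Sample complex numbers chosen only for illustration.
samples = [complex(3, 4), complex(12, -5), complex(1, 2), complex(5, 8)]

TOL = 1e-12  # slack for floating-point rounding


def check_lemma(alpha, beta):
    # (a) |alpha| = |alpha*| and |alpha|^2 = alpha * alpha*
    part_a = (math.isclose(abs(alpha), abs(alpha.conjugate()))
              and math.isclose(abs(alpha) ** 2,
                               (alpha * alpha.conjugate()).real))
    # (b) |alpha * beta| = |alpha| * |beta|
    part_b = math.isclose(abs(alpha * beta), abs(alpha) * abs(beta))
    # (c) triangle inequality
    part_c = abs(alpha + beta) <= abs(alpha) + abs(beta) + TOL
    return part_a and part_b and part_c


pairs_ok = all(check_lemma(a, b) for a in samples for b in samples)

# (d) the triangle inequality for a sum of several terms
sum_ok = abs(sum(samples)) <= sum(abs(z) for z in samples) + TOL
```

Both `pairs_ok` and `sum_ok` evaluate to `True` for these samples.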




In mathematics, we are often pleasantly surprised that ideas and techniques used to solve one 
problem can frequently be used to solve completely different problems. As an example, let us 
consider the problem of determining which integers can be written as a sum of squares of two 
integers. On the surface, this problem seems to deal only with the integers and doesn’t appear 
to be related in any way to complex numbers. Let us begin by observing that 


\[ 17 = 1^2 + 4^2 \quad \text{and} \quad 20 = 2^2 + 4^2. \]

Suppose we wish to determine if 340 can be written as a sum of two squares. At first glance, this problem seems to have nothing to do with our study of the complex numbers. However, if we let $\alpha = 1 + 4i$ and $\beta = 2 + 4i$, then we have

\[ 17 = \alpha \cdot \alpha^* \quad \text{and} \quad 20 = \beta \cdot \beta^*. \]

Performing a computation similar to one in the proof of Lemma 6.3(b), we have

\[ 340 = 17 \cdot 20 = (\alpha \cdot \alpha^*)(\beta \cdot \beta^*) = (\alpha \cdot \beta)(\alpha \cdot \beta)^*. \]

However, it is easy to compute that $\alpha \cdot \beta = -14 + 12i$. Therefore,

\[ 340 = (\alpha \cdot \beta)(\alpha \cdot \beta)^* = (-14 + 12i)(-14 - 12i) = (-14)^2 + 12^2 = 14^2 + 12^2. \]

Thus, 340 is also a sum of squares of two integers.


The preceding computation illustrates the fact that if two integers can each be expressed as a 
sum of squares of two integers, then their product can also be expressed this way. We now use 
ideas similar to those used in the proof of Lemma 6.3(b) to prove this fact. 


Proposition 6.4. Suppose $n, m$ are integers that can be written as a sum of squares of two integers. Then their product $n \cdot m$ can also be written as a sum of squares of two integers. In particular, if $n = a^2 + b^2$ and $m = c^2 + d^2$, then $n \cdot m = (ac - bd)^2 + (ad + bc)^2$.

Proof. Suppose $n = a^2 + b^2$ and $m = c^2 + d^2$; then

\[ n \cdot m = (a^2 + b^2)(c^2 + d^2) = (a + bi)(a + bi)^*(c + di)(c + di)^* \]
\[ = (a + bi)(c + di)(a + bi)^*(c + di)^* = ((a + bi)(c + di))((a + bi)(c + di))^* \]
\[ = ((ac - bd) + (ad + bc)i)((ac - bd) + (ad + bc)i)^* = (ac - bd)^2 + (ad + bc)^2. \]


A complete description of those integers that can be expressed as a sum of squares of two 
integers can be found in a course in number theory, and Proposition 6.4 is an important tool 
used to find that description. 
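The identity in Proposition 6.4 translates directly into a short function. The following is a minimal sketch; the function name `product_as_sum_of_squares` is a hypothetical choice for this illustration.

```python
def product_as_sum_of_squares(a, b, c, d):
    # Given n = a^2 + b^2 and m = c^2 + d^2, return integers (x, y)
    # with n * m = x^2 + y^2, using x = ac - bd and y = ad + bc.
    return (a * c - b * d, a * d + b * c)


# 17 = 1^2 + 4^2 and 20 = 2^2 + 4^2, as in the discussion of 340.
x, y = product_as_sum_of_squares(1, 4, 2, 4)
```

For these inputs the function returns $(-14, 12)$, the real and imaginary parts of $(1 + 4i)(2 + 4i)$, and $(-14)^2 + 12^2 = 340$.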


Examples

Since $58 = 3^2 + 7^2$ and $98 = 7^2 + 7^2$, we now have

\[ 58 \cdot 98 = (3^2 + 7^2)(7^2 + 7^2) = (3 + 7i)(3 - 7i)(7 + 7i)(7 - 7i) \]
\[ = ((3 + 7i)(7 + 7i))((3 - 7i)(7 - 7i)) = (-28 + 70i)(-28 - 70i) = (-28)^2 + 70^2 = 28^2 + 70^2. \]

Similarly, $26 = 1^2 + 5^2$ and $74 = 5^2 + 7^2$, so

\[ 26 \cdot 74 = (1^2 + 5^2)(5^2 + 7^2) = (1 + 5i)(1 - 5i)(5 + 7i)(5 - 7i) \]
\[ = ((1 + 5i)(5 + 7i))((1 - 5i)(5 - 7i)) = (-30 + 32i)(-30 - 32i) = (-30)^2 + 32^2 = 30^2 + 32^2. \]


Adding complex numbers is quite straightforward. It is no different from adding polynomials 
with real coefficients of degree at most 1. Thus, if there is anything you might find confusing 
or mysterious about the complex numbers, it would be multiplication. After all, the existence 
of two complex numbers, i and —i, whose square is —1 causes some people to be 
uncomfortable with the complex numbers. However, in the next section, we will see that when 
we view complex numbers geometrically, multiplication becomes very natural. 


6.2 Rectangular and Polar Form 


Since any complex number $\alpha$ can be written in the form $\alpha = a + bi$, we can describe $\alpha$ using the two real numbers $a$ and $b$. We call $a + bi$ the rectangular form of $\alpha$ because, loosely speaking, the arrows that represent $\alpha$ correspond to one of the diagonals in a rectangle with sides $a$ and $b$.

[Figure: a rectangle with sides $a$ and $b$ whose diagonal is the arrow from $(0, 0)$ to $(a, b)$.]

However, there is an alternative way to describe complex numbers using two real numbers. Recall, we consider two arrows in the plane to be equivalent if they have the same length and go in the same direction. Therefore, if we use one real number to describe the length of an arrow and a second to describe its direction, then we have a new way of describing complex numbers with two real numbers that may be more natural than rectangular form. To be more precise, if $\alpha = a + bi$, then the length of $\alpha$, $|\alpha|$, is equal to $\sqrt{a^2 + b^2}$. Next, in an attempt to describe the direction of the arrows representing $\alpha$, let $\theta$ be the angle made by the positive $x$-axis and the arrow representing $\alpha$ that begins at $(0, 0)$ and ends at $(a, b)$.

[Figure: the arrow from $(0, 0)$ to $(a, b)$ making the angle $\theta$ with the positive $x$-axis.]


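Observe that

\[ \cos(\theta) = \frac{a}{\sqrt{a^2 + b^2}} = \frac{a}{|\alpha|} \quad \text{and} \quad \sin(\theta) = \frac{b}{\sqrt{a^2 + b^2}} = \frac{b}{|\alpha|}. \]

Therefore, we now have

\[ \alpha = a + bi = \sqrt{a^2 + b^2}\left(\frac{a}{\sqrt{a^2 + b^2}} + \frac{b}{\sqrt{a^2 + b^2}}\,i\right) = |\alpha|\left(\frac{a}{|\alpha|} + \frac{b}{|\alpha|}\,i\right) = |\alpha|(\cos(\theta) + \sin(\theta) \cdot i). \]

Since $\sin(\theta) \cdot i = i\sin(\theta)$, we can write $\alpha = |\alpha|(\cos(\theta) + i\sin(\theta))$.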


In particular, we can now completely describe $\alpha$ using the nonnegative real number $|\alpha|$ along with the real number $\theta$. We call $\alpha = |\alpha|(\cos(\theta) + i\sin(\theta))$ the polar form of $\alpha$. If $n \in \mathbb{Z}$, observe that

\[ \cos(\theta) = \cos(\theta + n \cdot 2\pi) \quad \text{and} \quad \sin(\theta) = \sin(\theta + n \cdot 2\pi). \]

Therefore, any complex number can be represented in more than one way in polar form as we can replace $\theta$ by $\theta + n \cdot 2\pi$, for any $n \in \mathbb{Z}$. This fact will turn out to be very useful when, in Theorem 6.8, we see how to find roots of complex numbers. The following examples indicate that it is not difficult to go back and forth between rectangular and polar form. However, we should be aware that if $\alpha$ is in rectangular form, then we may not be able to compute the value of $\theta$ by hand. Conversely, if $\alpha$ is in polar form, then we also may not be able to compute the values of $\cos(\theta)$ and $\sin(\theta)$ by hand.
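When the values cannot be found by hand, the conversion can still be carried out numerically. The sketch below relies on Python's standard `cmath.polar` and `cmath.rect` functions; the wrapper names `to_polar` and `to_rect` and the sample number are assumptions made for illustration. Since `cmath.polar` reports an angle between $-\pi$ and $\pi$, the wrapper shifts it by a multiple of $2\pi$ into the range from $0$ to $2\pi$, which is allowed because $\theta$ may be replaced by $\theta + n \cdot 2\pi$.

```python
import cmath
import math


def to_polar(z):
    # Return (length, angle) with the angle normalized to [0, 2*pi).
    r, theta = cmath.polar(z)
    return r, theta % (2 * math.pi)


def to_rect(r, theta):
    # Return the complex number r * (cos(theta) + i * sin(theta)).
    return cmath.rect(r, theta)


z = complex(15, -15 * math.sqrt(3))   # 15 - 15*sqrt(3) i
r, theta = to_polar(z)                # approximately 30 and 5*pi/3
back = to_rect(r, theta)              # approximately z again
```

Converting to polar form and back recovers the original number up to floating-point rounding.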


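Examples

When converting $\alpha \in \mathbb{C}$ between rectangular and polar forms, you should first draw the arrow representing $\alpha$ that begins at $(0, 0)$. Viewing $\alpha$ in this way will make the conversion more understandable. For two of the examples that follow, we include such a diagram. You should draw the appropriate diagram when you work through the other examples.

Converting from rectangular to polar form: When converting to polar form, it is almost always best to first compute and factor out the length of the complex number. This will be reflected in all of our examples.

\[ -17i = 17(0 - i) = 17\left(\cos\left(\tfrac{3}{2}\pi\right) + i\sin\left(\tfrac{3}{2}\pi\right)\right). \]

\[ 15 - 15\sqrt{3}\,i = 30\left(\tfrac{1}{2} - \tfrac{\sqrt{3}}{2}\,i\right) = 30\left(\cos\left(\tfrac{5}{3}\pi\right) + i\sin\left(\tfrac{5}{3}\pi\right)\right). \]

\[ 4\sqrt{2} - 4\sqrt{2}\,i = 8\left(\tfrac{\sqrt{2}}{2} - \tfrac{\sqrt{2}}{2}\,i\right) = 8\left(\cos\left(\tfrac{7}{4}\pi\right) + i\sin\left(\tfrac{7}{4}\pi\right)\right). \]

\[ 3 + 4i = 5\left(\tfrac{3}{5} + \tfrac{4}{5}\,i\right) = 5(\cos(\theta) + i\sin(\theta)), \quad \text{where } \theta = \arctan\left(\tfrac{4}{3}\right). \]

\[ 6 - 2i = \sqrt{40}\left(\tfrac{6}{\sqrt{40}} - \tfrac{2}{\sqrt{40}}\,i\right) = \sqrt{40}(\cos(\phi) + i\sin(\phi)), \quad \text{where } \phi = 2\pi - \arctan\left(\tfrac{1}{3}\right). \]

[Figures: the arrows beginning at $(0, 0)$ that represent two of the examples above.]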


The last two conversions cannot be simplified further, as we cannot, by hand, compute $\arctan\left(\tfrac{4}{3}\right)$ or $\arctan\left(\tfrac{1}{3}\right)$.


Converting from polar to rectangular form: When converting to rectangular form, if 
possible, you should first try to compute the values of the sine and cosine. 


7 (cos (Fx) +isin (;7)) (=F 4.2) = a 


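\[ 12\left(\cos\left(\tfrac{\pi}{2}\right) + i\sin\left(\tfrac{\pi}{2}\right)\right) = 12(0 + i) = 12i, \]
\[ 58\left(\cos\left(\tfrac{11}{6}\pi\right) + i\sin\left(\tfrac{11}{6}\pi\right)\right) = 58\left(\tfrac{\sqrt{3}}{2} + i \cdot \left(-\tfrac{1}{2}\right)\right) = 29\sqrt{3} - 29i, \]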
10 (cos (=) +isin (=)) = 10cos (=) + 10sin (5) i. 


The last conversion also cannot be simplified further as we cannot by hand compute 
cos (4) or sin (4). This situation is very similar to one we face when we solve 


various algebra problems. Many algebra problems have terms in their answer that we 


; oe 
cannot compute by hand, like 7 or 353. Similarly, other terms that we cannot compute 


by hand, like arctan ($) or cos (4 


=), may well appear when converting between polar 
and rectangular form. 




Exercises for Sections 6.1 and 6.2 


1. Please refer to the following diagram. 


(a) How many different equivalence classes of arrows are there? 
(b) What real numbers are represented by these arrows? 


2. Please refer to the following diagram. 


(a) How many different equivalence classes of arrows are there? 
(b) What real numbers are represented by these arrows? 


3. Please refer to the following diagram. 


(a) How many different equivalence classes of arrows are there? 
(b) What real numbers are represented by these arrows? 


4. Please refer to the following diagram. 




(a) How many different equivalence classes of arrows are there? 
(b) What real numbers are represented by these arrows? 


5. Please refer to the following diagram. 


[Diagram: arrows drawn in the coordinate plane.]
(a) How many different equivalence classes of arrows are there? 
(b) What complex numbers are represented by these arrows? 
6. Please refer to the following diagram. 
[Diagram: arrows drawn in the coordinate plane.]


(a) How many different equivalence classes of arrows are there? 


(b) What complex numbers are represented by these arrows? 


7. Please refer to the following diagram. 
[Diagram: arrows drawn in the coordinate plane.]
(a) How many different equivalence classes of arrows are there? 
(b) What complex numbers are represented by these arrows? 
8. Please refer to the following diagram. 
[Diagram: arrows drawn in the coordinate plane.]


(a) How many different equivalence classes of arrows are there? 


(b) What complex numbers are represented by these arrows? 




In exercises 9-12, find the complex number represented by the arrow beginning at the point 
P and ending at the point Q. 


9. $P = (3, 11)$, $Q = (-8, 14)$
10. $P = (1 + \sqrt{2}, -6)$, $Q = (-6 + \sqrt{2}, -19)$
11. $P = (-4, 12)$, $Q = (17, 6)$
12. $P = (\pi, e)$, $Q = (6 - \pi, 13 + 4e)$


In exercises 13-16, you are given the starting point of an arrow used to represent the complex 
number 7 — 5i. Find the coordinates of the endpoint of the arrow. 


13. Starting point $= (3, 8)$
14. Starting point $= (-3, -8)$
15. Starting point $= (0, 4)$
16. Starting point $= (\sqrt{2}, 4 - \sqrt{7})$


For exercises 17-28, let $\alpha = 3 + 4i$, $\beta = 12 - 5i$, $\gamma = 1 + 2i$, $\delta = 5 + 8i$.


17. 
18. 
19. 
20. 
21. 
22. 
23. 
24. 
25. 
26. 
27. 
28. 
29. 
30. 


Find |a|. 

Find |6|. 

Find |y|. 

Find |6]. 

Find |aé]. 

Find |p*y|. 

Find |a* B6|. 

Find |a™'|. 

Find |A?|. 

Find |y~?|. 

Find |(5*)4]. 

Find |a+ B|. 

Find eight complex numbers a =a + bi such that |a| = /5 and a, b € Z. 
Find eight complex numbers 6B = c+ di such that |B| = /13 and c,d € Z. 


31. Find twelve complex numbers $\alpha = a + bi$ such that $|\alpha| = 5$ and $a, b \in \mathbb{Z}$.
32. Find twelve complex numbers $\beta = c + di$ such that $|\beta| = 13$ and $c, d \in \mathbb{Z}$.
33. (a) If $a \in \mathbb{Z}$, what are the only remainders that can occur when $a^2$ is divided by 4?

(b) If $a, b \in \mathbb{Z}$, what are the only remainders that can occur when $a^2 + b^2$ is divided by 4?

(c) Use part (b) to show that there do not exist integers $a, b$ such that $458931 = a^2 + b^2$.

(d) Generalize part (c) to describe an infinite collection of positive integers such that none of them can be written as a sum of squares of two integers.


For exercises 34-39, you may wish to use the following: 
\[ 2 = 1^2 + 1^2, \quad 5 = 1^2 + 2^2, \quad 13 = 2^2 + 3^2, \quad 17 = 1^2 + 4^2, \quad 29 = 2^2 + 5^2. \]

34. Use the ideas in Proposition 6.4 to find integers $A, B$ such that $58 = 2 \cdot 29 = A^2 + B^2$.

35. Use the ideas in Proposition 6.4 to find integers $C, D$ such that $85 = 5 \cdot 17 = C^2 + D^2$.

36. Use the ideas in Proposition 6.4 to find integers $E, F$ such that $221 = 13 \cdot 17 = E^2 + F^2$.

37. Use the ideas in Proposition 6.4 to find integers $G, H$ such that $493 = 17 \cdot 29 = G^2 + H^2$.

38. Use the ideas in Proposition 6.4 to find integers $J, K$ such that $2210 = 2 \cdot 5 \cdot 13 \cdot 17 = J^2 + K^2$.

39. Use the ideas in Proposition 6.4 to find integers $L, M$ such that $64090 = 2 \cdot 5 \cdot 13 \cdot 17 \cdot 29 = L^2 + M^2$.


40. (a) Give an example of two positive integers a, b such that neither a nor b nor ab can be 
written as a sum of squares of two integers. 


(b) Give an example of two positive integers a, b such that neither a nor b can be 
written as a sum of squares of two integers, but ab can be written as a sum of 
squares of two nonzero integers. 


In exercises 41-44, write all four of the complex numbers in polar form. 
41. 73, -—73, 73i, —73i. 

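42. $1 + \sqrt{3}\,i$, $-1 - \sqrt{3}\,i$, $-1 + \sqrt{3}\,i$, $1 - \sqrt{3}\,i$.

43. $5\sqrt{3} + 5i$, $-5\sqrt{3} - 5i$, $-5\sqrt{3} + 5i$, $5\sqrt{3} - 5i$.

44. $11 + 11i$, $-11 - 11i$, $-11 + 11i$, $11 - 11i$.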




For exercises 45—48, let 6 = arctan ( 3) and write all four of the complex numbers in polar 
form. You should leave your answer in terms of 6. Recall that if aw is a positive real number, 
then arctan(q) + arctan (4) =F, 


45. 443i, -4-3i, 4-3i, -—4+33i. 


46. 744i, -7-4i, 7-44, -74-4ti 

AP. 9A SB BSA... asa 
3,4; 3.4, 3. 4. Ba Ae 

48. gg Sg gh: eg gh. ga he 


In exercises 49-52, write all four of the complex numbers in rectangular form. 
49. cis (4), 4cis (4), 15cis (4), 2V3cis (+). 
50. cis (22) ,  3cis (2Z), 80cis (22) , 7A/2cis (22). 


51. cis (32), 8cis (=), 3./Scis (+), 173 cis (32). 


52. cis(42), 26cis (42), 11V3cis (4), 1208 cis (2). 


For exercises 53-56, let 6 = arctan (2) and write all four of the complex numbers in 


rectangular form. Once again, recall that if w is a positive real number, then 


arctan (a) + arctan (4) =3: 


53. cis(@), ~V/74cis(0), 6v74cis(0), 60cis(A). 

54. cis(m—6), cis(t+6), cis2Qx—6), V148cis(x —8). 

55. cis(Z—6), cis(Z+0), 2V/74cis(Z—6), 94/74cis (% — 8). 
56. cis(2t—6), cis(F+0), SV 74cis (FA), 23.V74cis (F —4). 


6.3 DeMoivre's Theorem and Roots of Complex Numbers


At this point, it is natural for you to wonder if we have gained anything by looking at complex 
numbers in polar form. Certainly, adding complex numbers is much easier when they are in 
rectangular form. However, as we will soon see, multiplying complex numbers is easier when 
they are in polar form. To see this, we first need to recall two basic facts from trigonometry 
about the cosine and sine of the sum of angles, which we state without proof. 


Lemma 6.5.

\[ \cos(\theta + \phi) = \cos(\theta)\cos(\phi) - \sin(\theta)\sin(\phi) \]

and

\[ \sin(\theta + \phi) = \cos(\theta)\sin(\phi) + \sin(\theta)\cos(\phi). \]


For convenience, we also introduce the following shorthand. 


Definition 6.6. $\operatorname{cis}(\theta)$ is a shorthand for the complex number $\cos(\theta) + i\sin(\theta)$. Therefore, the polar form of any $\alpha \in \mathbb{C}$ can be written as $|\alpha|\operatorname{cis}(\theta)$.
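The shorthand of Definition 6.6 is easy to mirror in code. The function below is a minimal sketch; Python has no built-in named `cis`, so the name is introduced here only to match the text.

```python
import math


def cis(theta):
    # cis(theta) = cos(theta) + i * sin(theta)
    return complex(math.cos(theta), math.sin(theta))


quarter_turn = cis(math.pi / 2)        # approximately i
alpha = 30 * cis(5 * math.pi / 3)      # polar form |alpha| * cis(theta)
```

Every value of `cis` has length 1, so multiplying by a nonnegative real number, as in `30 * cis(...)`, sets the length of the result.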


Examples

[Worked examples evaluating $\operatorname{cis}(\theta)$ at specific angles; the displayed equations are not recoverable.]

We can now prove the following. 


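Theorem 6.7 (DeMoivre's Theorem). If $\alpha = |\alpha|\operatorname{cis}(\theta)$ and $\beta = |\beta|\operatorname{cis}(\phi)$ are two complex numbers written in polar form, then

\[ \alpha \cdot \beta = |\alpha||\beta|\operatorname{cis}(\theta + \phi). \]

Furthermore, if $n \in \mathbb{Z}$ then

\[ \alpha^n = |\alpha|^n\operatorname{cis}(n \cdot \theta). \]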


Intuition. This theorem indicates how simple and natural multiplication in $\mathbb{C}$ is when we view complex numbers in polar form. The theorem says that when we multiply complex numbers in polar form, we simply multiply their lengths and add their angles. The proof is a direct and straightforward application of the addition formulas for the cosine and sine that we stated in Lemma 6.5. Also, observe how easy it is to find inverses in polar form. If $\alpha = |\alpha|\operatorname{cis}(\theta) \ne 0$, then letting $n = -1$ in DeMoivre's Theorem immediately tells us that $\alpha^{-1} = |\alpha|^{-1}\operatorname{cis}(-\theta)$.


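Proof. If $\alpha = |\alpha|\operatorname{cis}(\theta)$ and $\beta = |\beta|\operatorname{cis}(\phi)$, then

\[ \alpha \cdot \beta = (|\alpha|\operatorname{cis}(\theta))(|\beta|\operatorname{cis}(\phi)) = |\alpha||\beta|(\operatorname{cis}(\theta) \cdot \operatorname{cis}(\phi)). \tag{4} \]

However,

\[ \operatorname{cis}(\theta) \cdot \operatorname{cis}(\phi) = (\cos(\theta) + i\sin(\theta))(\cos(\phi) + i\sin(\phi)) \]
\[ = (\cos(\theta)\cos(\phi) - \sin(\theta)\sin(\phi)) + i(\cos(\theta)\sin(\phi) + \sin(\theta)\cos(\phi)). \]

Applying Lemma 6.5 to the preceding equation reveals to us that

\[ \operatorname{cis}(\theta) \cdot \operatorname{cis}(\phi) = \cos(\theta + \phi) + i\sin(\theta + \phi) = \operatorname{cis}(\theta + \phi). \]

Plugging this result into (4) results in

\[ \alpha \cdot \beta = |\alpha||\beta|\operatorname{cis}(\theta + \phi), \]

as desired.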


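To begin the proof of the second part of this theorem, let $T$ be those natural numbers $n$ such that $\alpha^n = |\alpha|^n\operatorname{cis}(n \cdot \theta)$. We would like to show that $T = \mathbb{N}$, and we will proceed by Mathematical Induction. To see that $1 \in T$, we merely observe that $\alpha^1 = |\alpha|^1\operatorname{cis}(1 \cdot \theta)$. Therefore, we now need to show that if $k$ is some natural number belonging to $T$, then $k + 1$ also belongs to $T$.

Since $k \in T$, we have $\alpha^k = |\alpha|^k\operatorname{cis}(k \cdot \theta)$. Applying the first part of this theorem along with the fact that $k \in T$, we now have

\[ \alpha^{k+1} = \alpha \cdot \alpha^k = (|\alpha|\operatorname{cis}(\theta))(|\alpha|^k\operatorname{cis}(k \cdot \theta)) = (|\alpha| \cdot |\alpha|^k)\operatorname{cis}(\theta + k \cdot \theta) = |\alpha|^{k+1}\operatorname{cis}((k + 1) \cdot \theta). \]

Thus, $k + 1$ does indeed belong to $T$.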


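To conclude the proof of the second part of this theorem, we now need to verify that $\alpha^n = |\alpha|^n\operatorname{cis}(n \cdot \theta)$ when $n = 0$ or $n$ is a negative integer. Our result certainly holds when $n = 0$ because, in this case, both $\alpha^n$ and $|\alpha|^n\operatorname{cis}(n \cdot \theta)$ are equal to 1. Finally, if $n$ is a negative integer, then $-n$ is a positive integer and our previous argument tells us that $\alpha^{-n} = |\alpha|^{-n}\operatorname{cis}(-n \cdot \theta)$. We now have

\[ \alpha^{-n} \cdot (|\alpha|^n\operatorname{cis}(n \cdot \theta)) = (|\alpha|^{-n}\operatorname{cis}(-n \cdot \theta)) \cdot (|\alpha|^n\operatorname{cis}(n \cdot \theta)) = (|\alpha|^{-n}|\alpha|^n)\operatorname{cis}((-n + n) \cdot \theta) = 1. \]

Observe that the previous equation tells us that $1 = \alpha^{-n} \cdot (|\alpha|^n\operatorname{cis}(n \cdot \theta))$. Multiplying this equation by $\alpha^n$ results in $\alpha^n = |\alpha|^n\operatorname{cis}(n \cdot \theta)$, as desired.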


Examples
iIcis (=) -6cis (=) = 66cis (=), 

scis (=) .Acis (27) =20cis (7) —20 (-3 -5) = — 10/3 — 104, 

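$\left(\frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2}i\right)^2 = \left(\operatorname{cis}\left(\frac{\pi}{4}\right)\right)^2 = \operatorname{cis}\left(\frac{\pi}{2}\right) = i$,

$\left(-\frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2}i\right)^2 = \left(\operatorname{cis}\left(\frac{5\pi}{4}\right)\right)^2 = \operatorname{cis}\left(\frac{5\pi}{2}\right) = \operatorname{cis}\left(\frac{\pi}{2}\right) = i$,

$(1 - i)^{20} = \left(\sqrt{2}\operatorname{cis}\left(\frac{7\pi}{4}\right)\right)^{20} = (\sqrt{2})^{20}\operatorname{cis}(35\pi) = 2^{10}\operatorname{cis}(\pi) = -2^{10} = -1024$.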


These examples indicate that when raising a complex number to a positive integer power, it is often to our advantage to first write the number in polar form. In the examples, we saw that $\frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2}i$ and $-\frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2}i$ are both square roots of i. We also saw that 1 − i is a 20th root of −1024. In light of these examples, we will take another look at DeMoivre’s Theorem with the goal of finding roots of complex numbers.
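The rule "raise the length to the nth power and multiply the angle by n" is easy to check numerically. The following is only a sketch in Python, a language the text itself does not use; the helper names `cis` and `polar_power` are our own, and floating-point arithmetic means the two answers agree only up to rounding error.

```python
import math

def cis(theta):
    """Return cos(theta) + i*sin(theta) as a Python complex number."""
    return complex(math.cos(theta), math.sin(theta))

def polar_power(length, theta, n):
    """DeMoivre's Theorem: (length * cis(theta))**n = length**n * cis(n * theta)."""
    return (length ** n) * cis(n * theta)

# 1 - i has length sqrt(2) and angle 7*pi/4, as in the last example above.
direct = (1 - 1j) ** 20                                   # repeated multiplication
via_polar = polar_power(math.sqrt(2), 7 * math.pi / 4, 20)  # length and angle only

print(direct)
print(via_polar)
```

Both printed values should be within rounding error of −1024. The same function also handles n = −1, which is the inverse formula noted just before the proof.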


DeMoivre’s Theorem says that raising a complex number to the nth power entails raising the 
length to the nth power and multiplying the angle by n. Conversely, it looks like in order to 
find an nth root of a complex number, we would need to take an nth root of the length and then 
divide the angle by n. Fortunately, the Intermediate Value Theorem (Theorem 4.4) guarantees 
that every positive real number has a positive nth root. Therefore, it appears that in $\mathbb{C}$ there are no obstacles to finding nth roots. This raises the question of how many nth roots a number can have in $\mathbb{C}$. Given $\alpha \in \mathbb{C}$, the nth roots of $\alpha$ are precisely the roots of the polynomial $x^n - \alpha$. In Chapter 12, we will show that any polynomial of degree n can have at most n roots in a field. Thus, an element of a field can have at most n nth roots in that field. For example, in a field a number may have no square roots and can have as many as two square roots. In particular, −1 has no square roots in $\mathbb{Q}$ and $\mathbb{R}$, but it has two square roots in $\mathbb{C}$. Shortly we will see that whereas 1 has two fourth roots in $\mathbb{Q}$ and $\mathbb{R}$, it has four fourth roots in $\mathbb{C}$. In fact, as we will now see, the number of nth roots every nonzero complex number has in $\mathbb{C}$ is indeed n.


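Theorem 6.8. Let $\alpha = |\alpha|\operatorname{cis}(\theta)$ be a nonzero complex number written in polar form and let $n \in \mathbb{N}$. If we let $|\alpha|^{1/n}$ denote the positive real nth root of $|\alpha|$, then

$\alpha_0 = |\alpha|^{1/n}\operatorname{cis}\left(\frac{\theta}{n}\right), \quad \alpha_1 = |\alpha|^{1/n}\operatorname{cis}\left(\frac{\theta + 1 \cdot 2\pi}{n}\right), \quad \ldots,$

$\alpha_j = |\alpha|^{1/n}\operatorname{cis}\left(\frac{\theta + j \cdot 2\pi}{n}\right), \quad \ldots, \quad \alpha_{n-1} = |\alpha|^{1/n}\operatorname{cis}\left(\frac{\theta + (n-1) \cdot 2\pi}{n}\right)$

are n different nth roots of $\alpha$ in $\mathbb{C}$.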




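Proof. We must first show that if $0 \le j \le n - 1$, then each $\alpha_j$ mentioned above is indeed an nth root of $\alpha$. To show this, we can apply the second part of DeMoivre’s Theorem to see that

$\alpha_j^n = \left(|\alpha|^{1/n}\operatorname{cis}\left(\frac{\theta + j \cdot 2\pi}{n}\right)\right)^n = \left(|\alpha|^{1/n}\right)^n\operatorname{cis}\left(n \cdot \frac{\theta + j \cdot 2\pi}{n}\right) = |\alpha|\operatorname{cis}(\theta + j \cdot 2\pi) = |\alpha|\operatorname{cis}(\theta) = \alpha$.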


To see that $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ are all different, we review some unit circle trigonometry. If $\phi$ is an angle in radian measure lying in the interval $[0, 2\pi)$, then the point $(\cos(\phi), \sin(\phi))$ lies on the unit circle $x^2 + y^2 = 1$.

[Figure: the point $(\cos\phi, \sin\phi)$ on the unit circle]

Furthermore, if $\phi_1, \phi_2$ are two angles in the interval $[0, 2\pi)$ that are not equal, then $(\cos(\phi_1), \sin(\phi_1))$ and $(\cos(\phi_2), \sin(\phi_2))$ are different points on the unit circle. Thus, the complex numbers $\operatorname{cis}(\phi_1)$ and $\operatorname{cis}(\phi_2)$ are different.

[Figure: the point $(\cos\phi_1, \sin\phi_1)$ on the unit circle]


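In light of this, since

$0, \ \frac{1 \cdot 2\pi}{n}, \ \frac{2 \cdot 2\pi}{n}, \ \ldots, \ \frac{j \cdot 2\pi}{n}, \ \ldots, \ \frac{(n-1) \cdot 2\pi}{n}$

are n different angles that belong to the interval $[0, 2\pi)$, we can conclude that the complex numbers

(5) $\operatorname{cis}(0), \ \operatorname{cis}\left(\frac{1 \cdot 2\pi}{n}\right), \ \operatorname{cis}\left(\frac{2 \cdot 2\pi}{n}\right), \ \ldots, \ \operatorname{cis}\left(\frac{j \cdot 2\pi}{n}\right), \ \ldots, \ \operatorname{cis}\left(\frac{(n-1) \cdot 2\pi}{n}\right)$

are all different.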


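Next, observe that if $\gamma$ is a nonzero element of $\mathbb{C}$ and if $\beta_1, \beta_2 \in \mathbb{C}$ are not equal, then $\gamma \cdot \beta_1$ and $\gamma \cdot \beta_2$ are also not equal. This is easy to check, for if $\gamma \cdot \beta_1 = \gamma \cdot \beta_2$, then $\gamma^{-1} \cdot \gamma \cdot \beta_1 = \gamma^{-1} \cdot \gamma \cdot \beta_2$, which would immediately imply that $\beta_1 = \beta_2$. Note that this is not so much a fact about $\mathbb{C}$ but is a fact that holds in all fields.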


If we multiply the n different complex numbers in (5) by the nonzero complex number $|\alpha|^{1/n}\operatorname{cis}\left(\frac{\theta}{n}\right)$, we obtain $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$. However, the argument above tells us that they are all different, thereby concluding the proof.
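The formula in Theorem 6.8 translates almost line for line into a short computation. The sketch below is in Python, which is our choice and not part of the text; the function name `nth_roots` is hypothetical. It relies on the standard library functions `cmath.polar` and `cmath.rect`, and note that `cmath.polar` returns an angle in $[-\pi, \pi]$ and not in $[0, 2\pi)$, which is harmless because Theorem 6.8 works for any angle θ with α = |α|cis(θ).

```python
import cmath
import math

def nth_roots(alpha, n):
    """Return the n different nth roots of a nonzero complex number alpha,
    following Theorem 6.8: |alpha|**(1/n) * cis((theta + j*2*pi)/n), j = 0..n-1."""
    if alpha == 0 or n < 1:
        raise ValueError("alpha must be nonzero and n must be a natural number")
    length, theta = cmath.polar(alpha)      # alpha = length * cis(theta)
    root_length = length ** (1.0 / n)       # the positive real nth root of |alpha|
    return [cmath.rect(root_length, (theta + j * 2 * math.pi) / n)
            for j in range(n)]

# The 6 sixth roots of 64, listed in order of increasing angle.
for root in nth_roots(64, 6):
    print(root)
```

Because the answers are floating-point approximations, a root such as −2 may print with a tiny imaginary part; raising each printed value to the sixth power should return 64 up to rounding error.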


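Examples

Finding the 4 fourth roots of 1. In polar form, $1 = 1\operatorname{cis}(0)$, so Theorem 6.8 tells us that the 4 fourth roots of 1 are

$1^{1/4}\operatorname{cis}\left(\frac{0}{4}\right) = 1 \cdot \operatorname{cis}(0) = 1$,

$1^{1/4}\operatorname{cis}\left(\frac{1 \cdot 2\pi}{4}\right) = 1 \cdot \operatorname{cis}\left(\frac{\pi}{2}\right) = i$,

$1^{1/4}\operatorname{cis}\left(\frac{2 \cdot 2\pi}{4}\right) = 1 \cdot \operatorname{cis}(\pi) = -1$,

$1^{1/4}\operatorname{cis}\left(\frac{3 \cdot 2\pi}{4}\right) = 1 \cdot \operatorname{cis}\left(\frac{3\pi}{2}\right) = -i$.

Thus, exactly 2 of the 4 fourth roots of 1 lie in $\mathbb{R}$.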




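Finding the 6 sixth roots of 64. In polar form, $64 = 64\operatorname{cis}(0)$, so Theorem 6.8 tells us that the 6 sixth roots of 64 are

$64^{1/6}\operatorname{cis}\left(\frac{0}{6}\right) = 2 \cdot \operatorname{cis}(0) = 2$,

$64^{1/6}\operatorname{cis}\left(\frac{1 \cdot 2\pi}{6}\right) = 2 \cdot \operatorname{cis}\left(\frac{\pi}{3}\right) = 2\left(\frac{1}{2} + \frac{\sqrt{3}}{2}i\right) = 1 + i\sqrt{3}$,

$64^{1/6}\operatorname{cis}\left(\frac{2 \cdot 2\pi}{6}\right) = 2 \cdot \operatorname{cis}\left(\frac{2\pi}{3}\right) = 2\left(-\frac{1}{2} + \frac{\sqrt{3}}{2}i\right) = -1 + i\sqrt{3}$,

$64^{1/6}\operatorname{cis}\left(\frac{3 \cdot 2\pi}{6}\right) = 2 \cdot \operatorname{cis}(\pi) = 2(-1) = -2$,

$64^{1/6}\operatorname{cis}\left(\frac{4 \cdot 2\pi}{6}\right) = 2 \cdot \operatorname{cis}\left(\frac{4\pi}{3}\right) = 2\left(-\frac{1}{2} - \frac{\sqrt{3}}{2}i\right) = -1 - i\sqrt{3}$,

$64^{1/6}\operatorname{cis}\left(\frac{5 \cdot 2\pi}{6}\right) = 2 \cdot \operatorname{cis}\left(\frac{5\pi}{3}\right) = 2\left(\frac{1}{2} - \frac{\sqrt{3}}{2}i\right) = 1 - i\sqrt{3}$.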


For other problems involving nth roots, we may not be able to simplify the answer as much as we did in the previous two examples, as we may not be able to compute the exact value of either $|\alpha|^{1/n}$ or $\operatorname{cis}\left(\frac{\theta + j \cdot 2\pi}{n}\right)$.


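Finding the 3 cube roots of 10i. In polar form, $10i = 10\operatorname{cis}\left(\frac{\pi}{2}\right)$, so Theorem 6.8 tells us that the 3 cube roots of 10i are

$10^{1/3}\operatorname{cis}\left(\frac{\pi/2}{3}\right) = 10^{1/3}\operatorname{cis}\left(\frac{\pi}{6}\right) = 10^{1/3}\left(\frac{\sqrt{3}}{2} + i \cdot \frac{1}{2}\right) = \frac{10^{1/3}\sqrt{3}}{2} + i\,\frac{10^{1/3}}{2}$,

$10^{1/3}\operatorname{cis}\left(\frac{\pi/2 + 1 \cdot 2\pi}{3}\right) = 10^{1/3}\operatorname{cis}\left(\frac{5\pi}{6}\right) = 10^{1/3}\left(-\frac{\sqrt{3}}{2} + i \cdot \frac{1}{2}\right) = -\frac{10^{1/3}\sqrt{3}}{2} + i\,\frac{10^{1/3}}{2}$,

$10^{1/3}\operatorname{cis}\left(\frac{\pi/2 + 2 \cdot 2\pi}{3}\right) = 10^{1/3}\operatorname{cis}\left(\frac{3\pi}{2}\right) = 10^{1/3}(-i) = -i\,10^{1/3}$.

Finding the 7 seventh roots of 1. In polar form, $1 = 1\operatorname{cis}(0)$, so Theorem 6.8 tells us that the 7 seventh roots of 1 are

$\operatorname{cis}(0) = 1, \ \operatorname{cis}\left(\frac{1 \cdot 2\pi}{7}\right), \ \operatorname{cis}\left(\frac{2 \cdot 2\pi}{7}\right), \ \operatorname{cis}\left(\frac{3 \cdot 2\pi}{7}\right), \ \operatorname{cis}\left(\frac{4 \cdot 2\pi}{7}\right), \ \operatorname{cis}\left(\frac{5 \cdot 2\pi}{7}\right), \ \operatorname{cis}\left(\frac{6 \cdot 2\pi}{7}\right)$.

However, this is as far as we can simplify our answer, as we cannot compute, by hand, the values of $\cos\left(\frac{j \cdot 2\pi}{7}\right)$ or $\sin\left(\frac{j \cdot 2\pi}{7}\right)$, where $1 \le j \le 6$.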


6.4 A Proof of the Fundamental Theorem of Algebra 


Let us now focus our attention on the Fundamental Theorem of Algebra. In Chapter 12, we 
will see that there are several equivalent ways to state this famous theorem. However, the 
version we will prove in this chapter asserts that every polynomial of degree at least 1 with 
coefficients in C must have a root in C. 


Before proving a theorem, one usually needs to come across enough examples and other types 
of evidence to develop an intuitive feeling that the theorem has a chance to be true. Therefore, 
before presenting a proof of the Fundamental Theorem of Algebra, let us look at some of the 
evidence that provides us with an intuitive feeling that the Fundamental Theorem of Algebra 
might be true. First, in Corollary 4.9, we showed that every polynomial of odd degree with 
coefficients in R must have a root in R. Then, in Theorem 6.8, we showed that any 
polynomial of the form $x^n - \alpha$, where $\alpha \in \mathbb{C}$, has a root in $\mathbb{C}$. Thus, Corollary 4.9 and
Theorem 6.8 indicate that there are two large classes of polynomials that must have roots in C. 
Certainly, this is not a proof of the Fundamental Theorem of Algebra, but it is sufficient 
evidence to indicate that we should look for a proof. 


Traditionally, when dealing with functions of a real variable, we usually denote the variable 
as x. Recall that every $\alpha \in \mathbb{C}$ can be written as $\alpha = a + bi$, with $a, b \in \mathbb{R}$. Therefore, if a
variable ranges through the complex numbers, we can think of representing it as x + yi, where
x, y are variables ranging through the real numbers. Thus, we traditionally write a complex
variable as z and represent it as z = x + yi, where x and y are real variables.


A typical polynomial with coefficients in $\mathbb{C}$ would be

$p(z) = 3z^2 + 5iz + (6 - 2i)$.

If we replace z by x + yi, we obtain

$p(z) = p(x + yi) = 3(x + yi)^2 + 5i(x + yi) + (6 - 2i) = ((3x^2 - 3y^2) + 6xy \cdot i) + (-5y + 5x \cdot i) + (6 - 2i) = (3x^2 - 3y^2 - 5y + 6) + (6xy + 5x - 2)i$.


As in this example, if p(z) = p(x + yi) is any polynomial with complex coefficients, we can always collect the real and imaginary parts of p(z) in terms of x and y. Therefore, we can write

$p(z) = p(x + yi) = f(x, y) + g(x, y)i$.

In addition, it immediately follows from Lemma 6.3(a) that

$|p(z)|^2 = (f(x, y) + g(x, y)i)(f(x, y) + g(x, y)i)^* = (f(x, y) + g(x, y)i)(f(x, y) - g(x, y)i) = f(x, y)^2 + g(x, y)^2$.




Furthermore, note that the only operations performed in finding f(x, y) and g(x, y) are addition, subtraction, and multiplication. Therefore, f(x, y) and g(x, y) must be polynomials with coefficients in $\mathbb{R}$. This observation, which will be used later in this chapter, can now be recorded as


Lemma 6.9. Let p(z) be a polynomial with coefficients in $\mathbb{C}$. If we let z = x + yi, then

$p(z) = p(x + yi) = f(x, y) + g(x, y)i$,

where $f(x, y), g(x, y) \in \mathbb{R}[x, y]$, the set of polynomials with real coefficients in the variables x and y. Furthermore, $|p(z)|^2 = f(x, y)^2 + g(x, y)^2$.
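For the sample polynomial $p(z) = 3z^2 + 5iz + (6 - 2i)$ treated above, the decomposition in Lemma 6.9 can be spot-checked numerically. This is only a sketch in Python (our choice of language); the function names `p`, `f`, and `g` simply mirror the symbols in the text, and agreement is up to floating-point rounding.

```python
def p(z):
    """The sample polynomial p(z) = 3z^2 + 5iz + (6 - 2i)."""
    return 3 * z ** 2 + 5j * z + (6 - 2j)

def f(x, y):
    """Real part of p(x + yi), a polynomial in x and y with real coefficients."""
    return 3 * x ** 2 - 3 * y ** 2 - 5 * y + 6

def g(x, y):
    """Imaginary part of p(x + yi), also a real polynomial in x and y."""
    return 6 * x * y + 5 * x - 2

# Compare p(x + yi) with f(x, y) + g(x, y)i, and |p|^2 with f^2 + g^2.
for x, y in [(0.0, 0.0), (1.0, 2.0), (-3.5, 0.25), (2.0, -7.0)]:
    value = p(complex(x, y))
    print(value, complex(f(x, y), g(x, y)), abs(value) ** 2, f(x, y) ** 2 + g(x, y) ** 2)
```

On each line the first two printed numbers should agree, as should the last two.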


When we proved Corollary 4.9, we informally discussed the fact that if $f(x) \in \mathbb{R}[x]$ has degree at least 1, then $|f(x)| \to +\infty$ as $|x| \to +\infty$. One of the tools we will need to prove the Fundamental Theorem of Algebra is the analogous result for polynomials with coefficients in $\mathbb{C}$. When we state this fact in the next lemma, it will look rather technical, but you should convince yourself that it is a formal way of saying that $|p(z)| \to +\infty$ as $|z| \to +\infty$.


Lemma 6.10. Let p(z) be a polynomial with degree at least 1 with coefficients in C. For every 
real number M > 0, there exists a real number R > 0 such that | p(z)| > M, whenever |z| > R. 


Intuition. Once again, let us consider the polynomial $p(z) = 3z^2 + 5iz + (6 - 2i)$. Then

$p(z) = z^2\left(3 + \frac{5i}{z} + \frac{6 - 2i}{z^2}\right)$.

When |z| is large, $\frac{5i}{z}$ and $\frac{6 - 2i}{z^2}$ are close to 0. Therefore, the term $3 + \frac{5i}{z} + \frac{6 - 2i}{z^2}$ can be made as close to 3 as we wish by letting |z| be large.

As a result, when |z| is large, p(z) is approximately the same as the polynomial $3z^2$. Since $|3z^2| \to +\infty$ as $|z| \to +\infty$, it follows that |p(z)| also goes to $+\infty$ as $|z| \to +\infty$.


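Proof. We can write

$p(z) = \alpha_n z^n + \alpha_{n-1} z^{n-1} + \cdots + \alpha_1 z + \alpha_0$,

where the $\alpha_i \in \mathbb{C}$, $n \in \mathbb{N}$, and $\alpha_n \neq 0$. Therefore,

$p(z) = \alpha_n z^n\left(1 + \frac{\alpha_{n-1}}{\alpha_n} \cdot \frac{1}{z} + \cdots + \frac{\alpha_1}{\alpha_n} \cdot \frac{1}{z^{n-1}} + \frac{\alpha_0}{\alpha_n} \cdot \frac{1}{z^n}\right)$.

Next, let T be the maximum value of $\left|\frac{\alpha_j}{\alpha_n}\right|$, where j ranges from 0 to n − 1. If $z \in \mathbb{C}$ such that |z| > 2nT and |z| > 1, we have

(6) $\left|\frac{\alpha_j}{\alpha_n} \cdot \frac{1}{z^{n-j}}\right| = \left|\frac{\alpha_j}{\alpha_n}\right| \cdot \frac{1}{|z|^{n-j}} \le T \cdot \frac{1}{|z|} < \frac{1}{2n}$,

where the last inequality holds because |z| > 2nT. Furthermore, if we let

$\beta = \frac{\alpha_{n-1}}{\alpha_n} \cdot \frac{1}{z} + \cdots + \frac{\alpha_1}{\alpha_n} \cdot \frac{1}{z^{n-1}} + \frac{\alpha_0}{\alpha_n} \cdot \frac{1}{z^n}$,

then

$p(z) = \alpha_n z^n(1 + \beta)$

and Lemma 6.3(d) and (6) tell us that

$|\beta| \le \left|\frac{\alpha_{n-1}}{\alpha_n} \cdot \frac{1}{z}\right| + \cdots + \left|\frac{\alpha_1}{\alpha_n} \cdot \frac{1}{z^{n-1}}\right| + \left|\frac{\alpha_0}{\alpha_n} \cdot \frac{1}{z^n}\right| < n \cdot \frac{1}{2n} = \frac{1}{2}$.

Since $|\beta| < \frac{1}{2}$, when we write $\beta = a + bi$, we have

$|a| = \sqrt{a^2} \le \sqrt{a^2 + b^2} = |\beta| < \frac{1}{2}$.

Therefore, a is a real number that lies between $\frac{1}{2}$ and $-\frac{1}{2}$. As a result, $1 + a > \frac{1}{2}$ and it follows that

$|1 + \beta| = |1 + (a + bi)| = |(1 + a) + bi| = \sqrt{(1 + a)^2 + b^2} \ge \sqrt{(1 + a)^2} = |1 + a| > \frac{1}{2}$.

Remembering that |z| > 2nT and |z| > 1, we now have

(7) $|p(z)| = |\alpha_n z^n(1 + \beta)| = |\alpha_n||z^n||1 + \beta| = |\alpha_n||z|^n|1 + \beta| > \frac{|\alpha_n|}{2}|z|$.

Finally, suppose we are given a real number M > 0. Then let R be a real number such that

$R > 2nT$, $R > 1$, and $R > \frac{2M}{|\alpha_n|}$.

It now follows from (7) that if |z| > R, we have

$|p(z)| > \frac{|\alpha_n|}{2}|z| > \frac{|\alpha_n|}{2} \cdot \frac{2M}{|\alpha_n|} = M$.

Thus, |p(z)| > M whenever |z| > R, as desired.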
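The proof of Lemma 6.10 is constructive: given M, it names a radius R that works. The following sketch, in Python and with function names of our own choosing (`poly_eval`, `escape_radius`), computes one such R for a polynomial given by its list of coefficients and then samples a few points outside the circle of radius R. Adding 1 to the maximum is just one convenient way to make the three inequalities for R strict; it is not taken from the text.

```python
import cmath
import math

def poly_eval(coeffs, z):
    """Evaluate a polynomial at z by Horner's rule; coeffs[j] multiplies z**j."""
    result = 0
    for c in reversed(coeffs):
        result = result * z + c
    return result

def escape_radius(coeffs, M):
    """Return an R with R > 2nT, R > 1, and R > 2M/|a_n|, as in the proof.
    coeffs = [a_0, a_1, ..., a_n] with a_n nonzero and n >= 1."""
    n = len(coeffs) - 1
    a_n = coeffs[-1]
    T = max(abs(a / a_n) for a in coeffs[:-1])   # max of |a_j / a_n|, j < n
    return max(2 * n * T, 1.0, 2 * M / abs(a_n)) + 1.0

# p(z) = 3z^2 + 5iz + (6 - 2i), the polynomial from the Intuition paragraph.
coeffs = [6 - 2j, 5j, 3]
M = 100.0
R = escape_radius(coeffs, M)
for k in range(8):
    z = cmath.rect(1.001 * R, k * math.pi / 4)   # a point with |z| slightly above R
    print(abs(z) > R, abs(poly_eval(coeffs, z)) > M)
```

Sampling finitely many points proves nothing, of course; the lemma guarantees the inequality for every z with |z| > R.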


In Chapters 1 and 4, we remarked that the real numbers are not a purely algebraic object. Since 
the complex numbers are constructed directly from the real numbers, they too are not a purely 
algebraic object. Therefore, any proof of the Fundamental Theorem of Algebra must use some results related to calculus. The proof that is the most algebraic requires not only the Intermediate Value Theorem but also more group theory and Galois theory than we will need to prove the insolvability of the quintic. On the other hand, we will soon present a relatively short and easy proof that applies some results on continuous functions of one and two real variables. Since the continuity of functions of two variables is a somewhat subtle concept, and since we only need to apply these results to polynomials, we will only state them for polynomials.


Proposition 6.11—The Extreme Value Theorem 


(a) Let f(x) be a polynomial with real coefficients. Then f(x) has a minimum and maximum value on the interval $a \le x \le b$.


(b) Let g(x, y) be a polynomial in two variables with real coefficients. Then g(x, y) has a minimum and maximum value on the interval $x^2 + y^2 \le R$.


We will not be providing a proof of Proposition 6.11, as it takes us rather far afield. However, 
hopefully you are familiar and comfortable with the Extreme Value Theorem from your 
courses in calculus. Proving the Fundamental Theorem of Algebra means that it will be 
available for us to use at any point throughout the remainder of this book. Indeed, the 
Fundamental Theorem of Algebra will allow us to reduce very abstract situations to the more 
concrete world of complex numbers, and this will greatly simplify the development of the 
Galois theory needed to prove the insolvability of the quintic. 


Theorem 6.12—The Fundamental Theorem of Algebra. If p(z) is a polynomial of degree at least 1 with coefficients in $\mathbb{C}$, then p(z) has a root in $\mathbb{C}$.


Intuition. We will proceed with a proof by contradiction, and, as we mentioned in 
Chapter 2, such proofs are often lacking in intuition. Fortunately, for this particular proof by 
contradiction, it is still fairly easy to present an overview of the main ideas. 


Using Lemmas 6.9 and 6.10 along with the Extreme Value Theorem, we first show that there exists some $z_0 \in \mathbb{C}$ that minimizes |p(z)|. If p(z) has no root in $\mathbb{C}$, then $|p(z_0)|$ is some positive real number M. We then use p(z) to construct a polynomial q(z) such that the minimum value of |q(z)| is also M, but it occurs at 0. Next, q(z) is used to construct a polynomial r(z) with the properties that the minimum value of |r(z)| is 1, this minimum occurs at 0, and the constant term of r(z) is 1.


Finally, r(z) is used to construct a polynomial H(z) that satisfies the same three conditions as r(z) but also has the property that

$H(z) = 1 - z^k + z^{k+1}s(z)$,

where s(z) is yet another polynomial with coefficients in $\mathbb{C}$. However, we can then use the Extreme Value Theorem to show that there exists a real number a such that |H(a)| < 1. Since the minimum value of |H(z)| is 1, this contradiction shows that the polynomial p(z) must have had a root in $\mathbb{C}$.


Proof. By way of contradiction, suppose p(z) has no root in C. We will use p(z) to construct a 
polynomial H(z) that has two incompatible properties. 


Let $N = |p(0)|$; by Lemma 6.10 there is a positive real number R such that |p(z)| > N whenever |z| > R. Letting z = x + yi, Lemma 6.9 says that $|p(z)|^2$ is a polynomial in x and y with coefficients in $\mathbb{R}$. Therefore, the Extreme Value Theorem asserts that $|p(z)|^2$ has a minimum value on the interval $x^2 + y^2 \le R^2$. Since $|z|^2 = x^2 + y^2$, this is equivalent to saying that $|p(z)|^2$ has a minimum value on the interval $|z| \le R$. If we let $z_0$ be the point in the interval $|z| \le R$ that minimizes $|p(z)|^2$, then it is easy to see that $z_0$ is also the point in the interval $|z| \le R$ that minimizes |p(z)|. However, when |z| > R, we know that

$|p(z)| > N = |p(0)| \ge |p(z_0)|$.

Thus, $|p(z_0)|$ is the minimum value of |p(z)| throughout all of $\mathbb{C}$, and we let $M = |p(z_0)|$. Observe that since $p(z_0) \neq 0$, we know that $M \neq 0$.


Next, let $q(z) = p(z + z_0)$. Certainly q(z) is also a polynomial with coefficients in $\mathbb{C}$ that has the same degree as p(z). Furthermore, every value of q(z) is also a value of p(z); therefore, the minimum value of |q(z)| is also M, but it now occurs at 0 as

$|q(0)| = |p(z_0)| = M$.

If we let $\alpha_0$ be the constant term of q(z), then

$\alpha_0 = q(0) \neq 0$.

Therefore, we can now consider the polynomial $r(z) = \frac{q(z)}{\alpha_0}$. We can see that r(z) is yet another polynomial with coefficients in $\mathbb{C}$ that has the same degree as p(z). Furthermore, for any $z \in \mathbb{C}$, $|r(z)| = \frac{|q(z)|}{|\alpha_0|}$. Therefore, the minimum value of |r(z)| also occurs at 0 and this minimum value is

$|r(0)| = \frac{|q(0)|}{|\alpha_0|} = \frac{|\alpha_0|}{|\alpha_0|} = 1$.

In addition, note that 1 is also the constant term of r(z).


Since r(z) has degree at least 1, we can let k be the smallest positive integer such that the coefficient of $z^k$ in r(z) is nonzero. Therefore, we can write

$r(z) = 1 + \beta z^k + \text{terms of higher degree}$.

Since $\beta \neq 0$, we know that $\beta^{-1} \in \mathbb{C}$, and Theorem 6.8 tells us that there is some $\gamma \in \mathbb{C}$ such that $\gamma^k = -\beta^{-1}$. We can now let $H(z) = r(\gamma z)$ and observe that

$H(z) = r(\gamma z) = 1 + \beta(\gamma z)^k + \text{terms of higher degree} = 1 + (\beta)(-\beta^{-1})z^k + \text{terms of higher degree} = 1 - z^k + \text{terms of higher degree}$.

Therefore, we can write

$H(z) = 1 - z^k + z^{k+1}s(z)$,

where s(z) is a polynomial with coefficients in $\mathbb{C}$. Every value of H(z) is also a value of r(z) and since

$H(0) = r(\gamma \cdot 0) = r(0) = 1$,

it follows that the minimum value of |H(z)| is also 1.


Let us now turn to the polynomial s(z). By Lemma 6.9, $|s(z)|^2$ is a polynomial with real coefficients in two real variables. Therefore, the Extreme Value Theorem tells us that $|s(z)|^2$ has a maximum value on the interval $x^2 + y^2 \le 1$. This immediately implies that |s(z)| also has a maximum value on the interval $x^2 + y^2 \le 1$, and we let T denote the maximum value of |s(z)| on the interval $x^2 + y^2 \le 1$. Finally, let a be a real number in the open interval (0, 1) such that aT < 1. In this case, we can see that $1 - a^k$, $1 - aT$, and a are all positive real numbers. In particular, $a^k(1 - aT) > 0$, and this implies that $1 - a^k(1 - aT) < 1$.


At first, it may not be clear why the previous inequalities are useful. However, if we combine these inequalities with the formula $H(z) = 1 - z^k + z^{k+1}s(z)$ and the triangle inequality, we see that

$|H(a)| = |1 - a^k + a^{k+1}s(a)| \le |1 - a^k| + |a^{k+1}s(a)| = 1 - a^k + |a|^{k+1}|s(a)| \le 1 - a^k + a^{k+1}T = 1 - a^k(1 - aT) < 1$.


As a result, despite the fact that the minimum value of | H(z)| is 1, we have shown that 
| H(a)| < 1. This is a contradiction, so the polynomial p(z) must have a root in C, thereby 
proving the theorem. 
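The last step of the proof can be imitated numerically: starting from any polynomial r(z) with constant term 1 and degree at least 1, the substitution $z \mapsto \gamma z$ followed by a small positive real a produces a point where |r| drops below 1. The sketch below is in Python with hypothetical names (`poly_eval`, `point_below_one`), and it makes one simplification that is our assumption, not the text's: instead of the true maximum T of |s(z)| on the unit disk, it uses the cruder upper bound given by the sum of the absolute values of the coefficients of s(z), which still satisfies the inequalities used above.

```python
def poly_eval(coeffs, z):
    """Evaluate a polynomial at z by Horner's rule; coeffs[j] multiplies z**j."""
    result = 0
    for c in reversed(coeffs):
        result = result * z + c
    return result

def point_below_one(r_coeffs):
    """Given r(z) = 1 + beta*z^k + (higher terms), return (z, |r(z)|) with |r(z)| < 1.
    r_coeffs[0] must be 1 and r must have degree at least 1."""
    # k is the smallest positive index with a nonzero coefficient beta.
    k = next(j for j in range(1, len(r_coeffs)) if r_coeffs[j] != 0)
    beta = r_coeffs[k]
    gamma = complex(-1 / beta) ** (1.0 / k)     # one kth root of -1/beta
    # Coefficients of H(z) = r(gamma * z); the coefficient of z^k becomes -1.
    h = [c * gamma ** j for j, c in enumerate(r_coeffs)]
    # Upper bound for |s(z)| on |z| <= 1 (assumption: cruder than the true maximum).
    T = sum(abs(c) for c in h[k + 1:])
    a = 0.5 if T == 0 else min(0.5, 1 / (2 * T))   # 0 < a < 1 and a*T < 1
    return gamma * a, abs(poly_eval(h, a))

r = [1, 0, 2 + 1j, 3, -1j]      # r(z) = 1 + (2 + i)z^2 + 3z^3 - iz^4, so k = 2
z, value = point_below_one(r)
print(z, value)
```

The printed value is |H(a)| = |r(γa)| and should be strictly less than 1, which is exactly the inequality that produces the contradiction in the proof.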


The Fundamental Theorem of Algebra is a beautiful and important theorem. However, we 
should keep in mind what the Fundamental Theorem of Algebra says and what it doesn’t say. It 
says that any polynomial of degree at least 1 with coefficients in C must have a root in C. But 
it is important to note that although it tells us of the existence of a root, it does not provide us 
with an algorithm or formula for finding a root. This is quite different from the situation with 
the Intermediate Value Theorem in Chapter 4. Recall that the Intermediate Value Theorem 
could not only be used to prove the existence of a root of certain polynomials but could also be 
used to compute the root to any desired degree of accuracy. In no way does this detract from
the importance or the beauty of the Fundamental Theorem of Algebra. However, remember
that it is always important to understand exactly what a theorem says and what it does not say. 


Much of this chapter dealt with viewing the complex numbers as equivalence classes of 
arrows in the plane. Hopefully this approach made the complex numbers more “real” and less 
confusing to you. In abstract algebra, we try to understand abstract objects. In trying to better 
understand these objects, it is quite common and very natural to try to view these objects as 
concretely as possible. In fact, a very active field of algebraic research today is called 
representation theory, and it deals with trying to view and understand algebraic objects in 
terms of their relationships to more concrete objects. 


Toward the beginning of Chapter 5, we remarked that it is much simpler to construct the 
complex numbers from the real numbers than it is to construct the real numbers from the 
rational numbers. At this point, we have enough experience with these two constructions to 
reflect back on them. Recall that given the rational numbers, we can construct the real numbers 
as equivalence classes of Cauchy sequences of rational numbers. The addition or 
multiplication of two real numbers therefore involves adding or multiplying all the terms from 
two Cauchy sequences. It takes an understanding of convergence simply to justify that 
addition and multiplication of equivalence classes of Cauchy sequences are well defined. 


On the other hand, every complex number can easily be written in terms of two real numbers 
using either rectangular or polar form. Addition and multiplication in C are easily seen to be 
well defined. Furthermore, although checking that C satisfies the nine properties of a field can 
occasionally be somewhat tedious, the work involved is fairly straightforward, and it is easy to 
see that C inherits these nine properties directly from R. Therefore, advancing from an 
understanding of the basic algebraic properties of R to a similar understanding of C is a step 
forward, but it is not a difficult or intimidating leap forward. 


In this chapter, we chose to represent complex numbers as equivalence classes of arrows in the 
plane. However, there are other ways to represent C in terms of R. We will now briefly 
mention two of these other approaches of representing C. Both of these approaches are more 
algebraic and less geometric than using arrows in the plane, and both use ideas and techniques 
that we have not yet fully discussed. For these reasons, it was preferable to represent C using 
arrows in the plane. 


In the first of these additional approaches, we let $\mathbb{R}[x]$ denote the set of polynomials with coefficients in $\mathbb{R}$. We then define ~ on the set $\mathbb{R}[x]$ as f(x) ~ g(x) precisely if f(x) − g(x) is a multiple of $x^2 + 1$. It is easy to check that ~ is indeed an equivalence relation. Observe that the equivalence class $[x^2]$ is the same as the equivalence class [−1], as $x^2 - (-1)$ is certainly divisible by $x^2 + 1$. It turns out that we can associate every $a + bi \in \mathbb{C}$ with the equivalence class [a + bx]. In this representation, the equivalence class [x] has the property that its square is [−1].
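To see this representation in action, one can store the class [a + bx] as the pair (a, b) and multiply by reducing $x^2$ to −1. The sketch below is in Python; the pair encoding and the name `mul_classes` are our own illustrative choices, not notation from the text.

```python
def mul_classes(u, v):
    """Multiply the classes [a + bx] and [c + dx], stored as pairs (a, b) and (c, d).
    (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, and x^2 is equivalent to -1."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)

x_class = (0, 1)                       # the class [x]
print(mul_classes(x_class, x_class))   # the class [x]^2
print(mul_classes((2, 3), (4, -5)))    # compare with (2 + 3i)(4 - 5i)
```

The first product is the pair for [−1], and the second agrees with the real and imaginary parts of the complex product (2 + 3i)(4 − 5i) = 23 + 2i.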


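For those of you familiar with matrix multiplication, the second additional approach deals with 2 × 2 matrices, as we can associate every $a + bi \in \mathbb{C}$ with the matrix $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$. In this representation, the matrix $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ is the additive inverse of the multiplicative identity and the matrices $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ both have the property that their square is $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$.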
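The matrix representation can be checked the same way. In this Python sketch the helper names `to_matrix` and `matmul` are hypothetical, and it assumes the usual convention that a + bi corresponds to the matrix with rows (a, −b) and (b, a).

```python
def to_matrix(a, b):
    """Matrix associated with a + bi (assumed convention: rows (a, -b) and (b, a))."""
    return ((a, -b), (b, a))

def matmul(m, n):
    """Product of two 2 x 2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

i_matrix = to_matrix(0, 1)             # corresponds to i
print(matmul(i_matrix, i_matrix))      # should be the matrix for -1
print(matmul(to_matrix(2, 3), to_matrix(4, -5)), to_matrix(23, 2))
```

The second line prints the same matrix twice, reflecting that multiplying the matrices mirrors the product (2 + 3i)(4 − 5i) = 23 + 2i.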


Exercises for Sections 6.3 and 6.4 


For exercises 1-8, let a=cis (3) , B=9cis (4) ,V=Tcis (2) ,o= 104cis (2). Unless your 
answer simplifies into the form a or the form bi, where a, b € R, you should leave your answer 
in polar form. 


1. Compute each of the following: a“, a’, a 

Compute each of the following: a~*, a*, a™~. 

Compute each of the following: wB, (a@B)*, (ap)’°. 
Compute each of the following: (wB)~!, (a@B)*, ((a@p)*)7!. 
Compute each of the following: y?,_ y°, _y!. 
Compute each of the following: By, (By', (By)*. 


Compute each of the following: 5°, 5°, 671. 


eo ND HW FPF Y BB 


Compute each of the following: y8, (8), yd. 


In exercises 9-16, you should convert the complex number into polar form before doing any 
other computations. Express your answers in both polar and rectangular form. 


9. 4p” 


© (oat) 


11 <@=3i)-" 
1 V3; 60 
12 (3-91) 


13. (2V3+2i)!? 
14, (—1—4/37)3 
15. (-5+5¥/3i)? 




16. (-B- Liye 


In exercises 17—26, leave your answers in polar and rectangular form. 


17. 


18. 


19. 


20. 


21. 


22. 


23. 


24. 


25. 


26. 


(a) 


(b) 


Find the 4 fourth roots of —1. 


Find the 4 fourth roots of —81. 


Find the 4 fourth roots of —25. 


Find the 4 fourth roots of —7. 


Find the 6 sixth roots of —1. 


Find the 6 sixth roots of —64. 


Find the 6 sixth roots of —8. 


Find the 6 sixth roots of —99. 


Find the 12 twelfth roots of 1. 
Find the 12 twelfth roots of 5!2. 


Find the 12 twelfth roots of 64. 
Find the 12 twelfth roots of 75. 
V3 


Find the two square roots of —5 ab 


Find the two square roots of —18 — 18/3i. 
Find the two square roots of —5 — 5/3i. 
Find the two square roots of —10./3 — 30i. 
Find the three cube roots of i. 

Find the three cube roots of 1000i. 

Find the three cube roots of 125i. 


Find the three cube roots of 10. 


In exercises 27-34, leave your answers in polar form. 


27. 


(a) 
(b) 


Find the 5 fifth roots of —i. 
Find the 5 fifth roots of —441. 




28. (a) Find the 5 fifth roots of —243i. 
(b) Find the 5 fifth roots of —70i. 

29. (a) Find the 3 cube roots of at zi. 
(b) Find the 3 cube roots of 500 + 500i. 

30. (a) Find the 3 cube roots of 53 + 533i. 
(b) Find the 3 cube roots of $+ {i. 

31. (a) Find the 7 seventh roots of cis (37). 
(b) Find the 7 seventh roots of 20cis (32 if 


). 


I 
(b) Find the 7 seventh roots of 23cis (#2). 


in 
32. (a) Find the 7 seventh roots of 6’cis (37 


33. (a) Find the 5 fifth roots of cis (). 
(b) Find the 5 fifth roots of 11cis (22). 
34. (a) Find the 5 fifth roots of 32cis (=). 
(b) Find the 5 fifth roots of 100cis (3). 
35. (a) Find $\alpha_1, \alpha_2, \alpha_3, \alpha_4 \in \mathbb{C}$ such that $x^4 + 1 = (x - \alpha_1)(x - \alpha_2)(x - \alpha_3)(x - \alpha_4)$.

(b) Use part (a) and Proposition 5.11 to write $x^4 + 1$ as a product of two quadratic polynomials with real coefficients.

36. (a) Find $\alpha_1, \alpha_2, \alpha_3, \alpha_4 \in \mathbb{C}$ such that $x^4 + 9 = (x - \alpha_1)(x - \alpha_2)(x - \alpha_3)(x - \alpha_4)$.

(b) Use part (a) and Proposition 5.11 to write $x^4 + 9$ as a product of two quadratic polynomials with real coefficients.

37. Use your answer to part (b) of exercise 35 to write $x^8 - 1$ as a product of polynomials with real coefficients that are either quadratic with no real roots or are linear.

38. Use your answer to part (b) of exercise 36 to write $x^8 - 81$ as a product of polynomials with real coefficients that are either quadratic with no real roots or are linear.

39. Find the 6 roots in $\mathbb{C}$ of $x^6 - 28x^3 + 27$.

40. Find the 6 roots in $\mathbb{C}$ of $x^6 - 133x^3 + 1000$.

41. Find the 8 roots in $\mathbb{C}$ of $x^8 - 13x^4 + 36$.

42. Find the 8 roots in $\mathbb{C}$ of $x^8 - 41x^4 + 400$.

43. Find the 10 roots in $\mathbb{C}$ of $x^{10} - 33x^5 + 32$.

44. Find the 10 roots in $\mathbb{C}$ of $x^{10} - 5x^5 + 6$.
For exercises 45-51, please first read the following:


DeMoivre’s Theorem tells us that $\cos(n\theta) + i\sin(n\theta) = (\cos(\theta) + i\sin(\theta))^n$, for all $n \in \mathbb{N}$. On the other hand, we can also compute $(\cos(\theta) + i\sin(\theta))^n$ a second way by expanding it out using either the distributive law or the binomial theorem. If we compare the real and imaginary parts of our two expressions for $(\cos(\theta) + i\sin(\theta))^n$, we obtain formulas for $\cos(n\theta)$ and $\sin(n\theta)$ in terms of $\cos(\theta)$ and $\sin(\theta)$. In the next seven exercises, we derive and examine some of these formulas. To do this, we will often need to use the identity $\sin^2(\theta) + \cos^2(\theta) = 1$.
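The comparison of real and imaginary parts described here can be carried out mechanically for any n. The sketch below, in Python with a hypothetical function name `multiple_angle`, expands $(\cos(\theta) + i\sin(\theta))^n$ by the binomial theorem, sorts the terms by whether the power of i is even or odd, and returns the resulting expressions for $\cos(n\theta)$ and $\sin(n\theta)$ evaluated at a given angle. It uses `math.comb`, which requires Python 3.8 or later, and it does not perform the further simplification with $\sin^2(\theta) + \cos^2(\theta) = 1$ that the exercises ask for.

```python
import math

def multiple_angle(n, theta):
    """Evaluate the real and imaginary parts of the binomial expansion of
    (cos(theta) + i*sin(theta))**n. Terms with even j are real because
    i**j = (-1)**(j/2); terms with odd j carry i * (-1)**((j-1)/2)."""
    c, s = math.cos(theta), math.sin(theta)
    real = sum(math.comb(n, j) * (-1) ** (j // 2) * c ** (n - j) * s ** j
               for j in range(0, n + 1, 2))
    imag = sum(math.comb(n, j) * (-1) ** ((j - 1) // 2) * c ** (n - j) * s ** j
               for j in range(1, n + 1, 2))
    return real, imag

theta = 0.7
for n in range(1, 6):
    print(n, multiple_angle(n, theta), (math.cos(n * theta), math.sin(n * theta)))
```

On each printed line the two pairs should agree up to rounding, which is DeMoivre’s Theorem read from right to left.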


45. Expand the expression $(\cos(\theta) + i\sin(\theta))^3$ to find polynomials f(x), g(x) such that $\cos(3\theta) = f(\cos(\theta))$ and $\sin(3\theta) = g(\sin(\theta))$.


46. Expand the expression $(\cos(\theta) + i\sin(\theta))^4$ to find polynomials f(x), h(x) such that $\cos(4\theta) = f(\cos(\theta))$ and $\sin(4\theta) = \cos(\theta) \cdot h(\sin(\theta))$.


47. Expand the expression $(\cos(\theta) + i\sin(\theta))^5$ to find polynomials f(x), g(x) such that $\cos(5\theta) = f(\cos(\theta))$ and $\sin(5\theta) = g(\sin(\theta))$.


48. Show that if $n \in \mathbb{N}$, then there exists a polynomial f(x) such that $\cos(n\theta) = f(\cos(\theta))$.


49. Show that if $n \in \mathbb{N}$ is odd, then there exists a polynomial g(x) such that $\sin(n\theta) = g(\sin(\theta))$. In addition, show that if n is even, then there exists a polynomial h(x) such that $\sin(n\theta) = \cos(\theta) \cdot h(\sin(\theta))$.


In light of exercises 48 and 49, we know that if n is an odd integer then there exist polynomials f(x), g(x) such that $\cos(n\theta) = f(\cos(\theta))$ and $\sin(n\theta) = g(\sin(\theta))$. In the next two exercises, we examine the relationship between f(x) and g(x). We will need to apply the fact that $\sin(\theta) = \cos\left(\frac{\pi}{2} - \theta\right)$.


50. Suppose $n \in \mathbb{N}$ such that n = 4k + 1, for some $k \in \mathbb{Z}$, and let f(x), g(x) be the polynomials in the formulas $\cos(n\theta) = f(\cos(\theta))$ and $\sin(n\theta) = g(\sin(\theta))$. Prove that f(x) = g(x). (Hint: Replace $\theta$ by $\frac{\pi}{2} - \theta$ in the formula $\cos(n\theta) = f(\cos(\theta))$.)


51. Suppose $n \in \mathbb{N}$ such that n = 4k + 3, for some $k \in \mathbb{Z}$, and let f(x), g(x) be the polynomials in the formulas $\cos(n\theta) = f(\cos(\theta))$ and $\sin(n\theta) = g(\sin(\theta))$. Prove that f(x) = −g(x). (Hint: Replace $\theta$ by $\frac{\pi}{2} - \theta$ in the formula $\cos(n\theta) = f(\cos(\theta))$.)


In exercises 52-57, replace z by x + yi in the polynomial p(z) and then find polynomials f(x, y), g(x, y) with real coefficients such that p(z) = f(x, y) + g(x, y)i.


52. p(z)=(2—3i)z+ (44+ 111) 
53. p(z)=(4—i)z+ (642i) 




54. p(z)=727+(1+i)z-3 

p(z)= (5 + 3i)27 + 8z + (-11 + 52) 
p@aaetetztl 

p(z) = (5 — 6i)z3 — 42? + Tiz + (31 — 8i) 


55. 
56. 
57. 
58. Let $\mathbb{R}[x]$ denote the set of polynomials with coefficients in $\mathbb{R}$. Then define the relation ~ on the set $\mathbb{R}[x]$ as f(x) ~ g(x) precisely if f(x) − g(x) is a multiple of $x^2 + 1$.

(a) Show that ~ is an equivalence relation.

(b) If we define the addition of equivalence classes as [f(x)] + [g(x)] = [f(x) + g(x)], show that addition is well defined.

(c) If we define the multiplication of equivalence classes as [f(x)] · [g(x)] = [f(x) · g(x)], show that multiplication is well defined.

(d) Show that [0] is the additive identity and [1] is the multiplicative identity.

(e) Show that every equivalence class contains exactly one polynomial of the form a + bx, where $a, b \in \mathbb{R}$.

(f) Show that the equivalence class [x] has the property that its square is equal to [−1].

(g) If either a or b is nonzero, show that $\left[\frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}x\right]$ is the multiplicative inverse of [a + bx].

(h) How similar does the set of equivalence classes seem to be to the set of complex numbers? Which equivalence class (or equivalence classes) seems to correspond to the complex number i?


The Integers Modulo n 


Before beginning this course, the number systems you were probably the most comfortable 
and familiar with were Z and Q. Similarly, the sets of polynomials you were probably the most 
comfortable and familiar with were Z[x] and Q[x]. Therefore, in Chapter 9, we will examine Z[x] and Q[x]. This will lay the foundation for a more general and detailed study in Chapter 12 of polynomials with coefficients in any field. But first, we use our experience with Z and equivalence classes to introduce a new collection of commutative rings and fields known as the integers modulo n and denoted as $\mathbb{Z}_n$.


There are many reasons for introducing commutative rings of the form $\mathbb{Z}_n$ at this point in time:


1. They provide us with many concrete examples of fields, commutative rings, and groups.


2. They are useful tools for proving results about the existence of roots and the factoring of polynomials in Z[x] and Q[x].


3. An understanding of the construction of $\mathbb{Z}_n$ from Z will be very useful in understanding the solution, in Chapter 17, of an important problem that deals with the existence of roots of polynomials with coefficients in more general fields.


7.1 Definitions and Basic Properties 


When we define $\mathbb{Z}_n$, we will not be defining one commutative ring but will actually be defining an infinite collection of rings. In particular, the commutative rings $\mathbb{Z}_2, \mathbb{Z}_3, \mathbb{Z}_4, \ldots, \mathbb{Z}_{99}, \ldots$ will all be different. To define $\mathbb{Z}_n$, we revisit an equivalence relation that appeared in Chapter 4. Once again, we will not be looking at a single equivalence relation but actually an infinite number of equivalence relations. If n > 1 is an integer, we let $\sim_n$ denote the equivalence relation where, for all $a, b \in \mathbb{Z}$, we say that $a \sim_n b$ precisely when n divides a − b. Note that for every n, we obtain a different equivalence relation.




Examples 


1 om >; 1 £6 aye 2~4 —14, 2~8 —14, 2 a5 —14, 2 43 —14, —5 ATI lige —5 £7 17, 
—5 ~2 17, —5 £417. 


Definition 7.1. If n > 1 is an integer, we let $\mathbb{Z}_n$ denote the equivalence classes of $\mathbb{Z}$ corresponding to the equivalence relation $\sim_n$. For each $a \in \mathbb{Z}$, we let $[a]_n$ denote the equivalence class containing a. $\mathbb{Z}_n$ consists of n equivalence classes and, for convenience, we often refer to them using the names $[0]_n, [1]_n, [2]_n, \ldots, [n-2]_n, [n-1]_n$.


Definition 7.1 mentions the fact that $\mathbb{Z}_n$ consists of exactly n equivalence classes. To see this,
observe that if $a, b \in \mathbb{Z}$, then $[a]_n = [b]_n$ precisely if n divides $a - b$. However, if
$0 \le a, b \le n-1$, then the only way n can divide $a - b$ is for a and b to be equal. In light of
this, the equivalence classes $[0]_n, [1]_n, [2]_n, \ldots, [n-2]_n, [n-1]_n$ must all be different.
Therefore, $\mathbb{Z}_n$ consists of at least n different equivalence classes. On the other hand, the
division algorithm asserts that if $a \in \mathbb{Z}$, then there exist $q, r \in \mathbb{Z}$ such that

$$a = q \cdot n + r \quad \text{and} \quad 0 \le r \le n-1.$$

This tells us that n divides $a - r$, so $[a]_n = [r]_n$. But since $0 \le r \le n-1$, this means that $[a]_n$ is
equal to one of $[0]_n, [1]_n, [2]_n, \ldots, [n-2]_n, [n-1]_n$. As a result, $\mathbb{Z}_n$ does indeed consist of
exactly n equivalence classes.
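The counting argument above can be tried out numerically. The following is only a small sketch in Python; the function names `congruent` and `representative` are illustrative choices, not notation from the text. It relies on the fact that, for a positive modulus, Python's `%` operator returns the remainder r of the division algorithm with 0 <= r <= n - 1.

```python
def congruent(a: int, b: int, n: int) -> bool:
    """Return True when n divides a - b, that is, when a ~_n b."""
    return (a - b) % n == 0


def representative(a: int, n: int) -> int:
    """Return the unique r with 0 <= r <= n - 1 and [a]_n = [r]_n."""
    # For n > 0, Python's % yields a remainder in 0..n-1 even for negative a.
    return a % n


if __name__ == "__main__":
    print(congruent(2, -14, 4))    # 4 divides 2 - (-14) = 16
    print(congruent(2, -14, 5))    # 5 does not divide 16
    # Every integer lands in one of exactly n = 4 classes.
    print(sorted({representative(a, 4) for a in range(-20, 21)}))
```

Checking a finite range of integers is of course not a proof; it merely illustrates that no fifth class ever shows up when n = 4.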


Even though $\mathbb{Z}_n$ consists of exactly n equivalence classes, it is important to be aware that each
class has an infinite number of names. Traditionally, the equivalence classes obtained from $\sim_n$
are called congruence classes, and if a, b belong to the same class, we say that a and b are
congruent modulo n, and we write this as

$$a \equiv b \pmod{n}.$$


Examples

$\mathbb{Z}_4$ consists of four equivalence classes and, for convenience, we usually denote them as
$[0]_4, [1]_4, [2]_4, [3]_4$. However, each class has an infinite number of elements and any
element in a class can be used as the name of the class. For example,

$$[0]_4 = [4]_4 = [-4]_4 = [8]_4 = [-8]_4 = [12]_4 = [-12]_4 = [16]_4 = [-16]_4 = \cdots$$
$$[1]_4 = [5]_4 = [-3]_4 = [9]_4 = [-7]_4 = [13]_4 = [-11]_4 = [17]_4 = [-15]_4 = \cdots$$
$$[2]_4 = [6]_4 = [-2]_4 = [10]_4 = [-6]_4 = [14]_4 = [-10]_4 = [18]_4 = [-14]_4 = \cdots$$
$$[3]_4 = [7]_4 = [-1]_4 = [11]_4 = [-5]_4 = [15]_4 = [-9]_4 = [19]_4 = [-13]_4 = \cdots$$




Similarly, $\mathbb{Z}_5$ consists of five equivalence classes, and, for convenience, we usually denote
them as $[0]_5, [1]_5, [2]_5, [3]_5, [4]_5$. Each class has an infinite number of elements, and
therefore each class has an infinite number of names. For example,

$$[0]_5 = [5]_5 = [-5]_5 = [10]_5 = [-10]_5 = [15]_5 = [-15]_5 = [20]_5 = [-20]_5 = \cdots$$
$$[1]_5 = [6]_5 = [-4]_5 = [11]_5 = [-9]_5 = [16]_5 = [-14]_5 = [21]_5 = [-19]_5 = \cdots$$
$$[2]_5 = [7]_5 = [-3]_5 = [12]_5 = [-8]_5 = [17]_5 = [-13]_5 = [22]_5 = [-18]_5 = \cdots$$
$$[3]_5 = [8]_5 = [-2]_5 = [13]_5 = [-7]_5 = [18]_5 = [-12]_5 = [23]_5 = [-17]_5 = \cdots$$
$$[4]_5 = [9]_5 = [-1]_5 = [14]_5 = [-6]_5 = [19]_5 = [-11]_5 = [24]_5 = [-16]_5 = \cdots$$
Examples

Using the $a \equiv b \pmod{n}$ notation, we have $17 \equiv -13 \pmod{6}$, $17 \equiv -13 \pmod{15}$,
$17 \not\equiv -13 \pmod{20}$, $17 \not\equiv -13 \pmod{8}$.


At this point, for each $n > 1$, we know that $\mathbb{Z}_n$ is a set consisting of n elements. However, in
order to say that $\mathbb{Z}_n$ is a commutative ring, we need to define addition and multiplication
and then check that they satisfy the eight properties required of commutative rings. In
mathematics, the simplest way to do something is often the right way to do it. That is the case
when it comes to defining addition and multiplication in $\mathbb{Z}_n$. Given congruence classes $[a]_n$
and $[b]_n$, the simplest way to define addition and multiplication would be to add and multiply
the names of the classes. As it turns out, that is exactly what we do.


Theorem 7.2. For every integer $n > 1$, we define addition and multiplication in $\mathbb{Z}_n$ as

$$[a]_n + [b]_n = [a + b]_n$$

and

$$[a]_n \cdot [b]_n = [a \cdot b]_n,$$

for all $a, b \in \mathbb{Z}$. With this addition and multiplication, $\mathbb{Z}_n$ is a commutative ring.
Examples

$[2]_5 + [4]_5 = [2+4]_5 = [6]_5$ and $[6]_5 = [1]_5$, thus $[2]_5 + [4]_5 = [1]_5$,
$[2]_5 \cdot [4]_5 = [2 \cdot 4]_5 = [8]_5$ and $[8]_5 = [3]_5$, thus $[2]_5 \cdot [4]_5 = [3]_5$,
$[8]_{10} + [5]_{10} = [8+5]_{10} = [13]_{10}$ and $[13]_{10} = [3]_{10}$, thus $[8]_{10} + [5]_{10} = [3]_{10}$,
$[8]_{10} \cdot [5]_{10} = [8 \cdot 5]_{10} = [40]_{10}$ and $[40]_{10} = [0]_{10}$, thus $[8]_{10} \cdot [5]_{10} = [0]_{10}$,
$[7]_{41} + [6]_{41} = [7+6]_{41} = [13]_{41}$,
$[7]_{41} \cdot [6]_{41} = [7 \cdot 6]_{41} = [42]_{41}$ and $[42]_{41} = [1]_{41}$, thus $[7]_{41} \cdot [6]_{41} = [1]_{41}$.
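For readers who like to experiment, the arithmetic of Theorem 7.2 can be modeled in a few lines of Python. This is a sketch under our own assumptions: the class name `Mod` and its fields are hypothetical and do not come from the text. Each class is stored under its name in the range 0 to n - 1, so that two objects compare equal exactly when they name the same congruence class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mod:
    """A congruence class [value]_n, stored by its representative in 0..n-1."""
    value: int
    n: int

    def __post_init__(self):
        # Normalise the name so that equal classes compare equal.
        object.__setattr__(self, "value", self.value % self.n)

    def __add__(self, other: "Mod") -> "Mod":
        assert self.n == other.n, "classes must share the same modulus"
        # [a]_n + [b]_n = [a + b]_n
        return Mod(self.value + other.value, self.n)

    def __mul__(self, other: "Mod") -> "Mod":
        assert self.n == other.n, "classes must share the same modulus"
        # [a]_n * [b]_n = [a * b]_n
        return Mod(self.value * other.value, self.n)


if __name__ == "__main__":
    print(Mod(2, 5) + Mod(4, 5))     # the class [1]_5
    print(Mod(8, 10) * Mod(5, 10))   # the class [0]_10
    print(Mod(7, 41) * Mod(6, 41))   # the class [1]_41
```

Storing the reduced representative is a design convenience; any other name of the class would describe the same element.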


Shortly, we will prove Theorem 7.2. In that proof we will show that $\mathbb{Z}_n$ has identity elements
for both addition and multiplication, and we will also find the additive inverse for each
equivalence class. Before reading the proof of Theorem 7.2, you should review the previous
examples and then try to anticipate which elements of $\mathbb{Z}_n$ are the identity elements and also
what the additive inverse of each class will be.


In several earlier chapters, we dealt with adding and multiplying equivalence classes. In those
cases, it was important to show that addition and multiplication were well defined. When
dealing with $\mathbb{Z}_n$, each equivalence class has an infinite number of different names. We will
need to show that the choice of a name for an equivalence class does not affect the answer in
addition and multiplication problems. The good news is that once we have succeeded in
proving that addition and multiplication in $\mathbb{Z}_n$ are well defined, it will be very easy to verify
that $\mathbb{Z}_n$ is a commutative ring.


Examples

In $\mathbb{Z}_5$, $[2]_5 = [17]_5$ and $[3]_5 = [-7]_5$. Observe that

$$[2]_5 + [3]_5 = [5]_5 = [0]_5 \quad \text{and} \quad [17]_5 + [-7]_5 = [10]_5 = [0]_5,$$
$$[2]_5 \cdot [3]_5 = [6]_5 = [1]_5 \quad \text{and} \quad [17]_5 \cdot [-7]_5 = [-119]_5 = [1]_5.$$

As a result,

$$[2]_5 + [3]_5 = [17]_5 + [-7]_5 \quad \text{and} \quad [2]_5 \cdot [3]_5 = [17]_5 \cdot [-7]_5.$$

Thus, in this example, changing the names of the equivalence classes did not change the
answer when adding and multiplying.

For another example, in $\mathbb{Z}_8$ we have $[5]_8 = [-11]_8$ and $[7]_8 = [39]_8$. Note that

$$[5]_8 + [7]_8 = [12]_8 = [4]_8 \quad \text{and} \quad [-11]_8 + [39]_8 = [28]_8 = [4]_8,$$
$$[5]_8 \cdot [7]_8 = [35]_8 = [3]_8 \quad \text{and} \quad [-11]_8 \cdot [39]_8 = [-429]_8 = [3]_8.$$

As a result,

$$[5]_8 + [7]_8 = [-11]_8 + [39]_8 \quad \text{and} \quad [5]_8 \cdot [7]_8 = [-11]_8 \cdot [39]_8.$$

Once again, changing the names of the equivalence classes did not change the answer
when adding and multiplying.
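The same experiment can be run over many names at once. The sketch below is written in Python, and the function name `well_defined` and its `spread` parameter are our own illustrative choices. It renames each class by adding multiples of n and checks that sums and products land in the same class. A finite check like this is evidence, not a proof; the proof is the subject of what follows.

```python
from itertools import product


def well_defined(n: int, spread: int = 3) -> bool:
    """Check, for one modulus n, that renaming classes never changes sums or products."""
    shifts = [k * n for k in range(-spread, spread + 1)]
    for a, c in product(range(n), repeat=2):
        for s, t in product(shifts, repeat=2):
            b, d = a + s, c + t      # other names: [b]_n = [a]_n and [d]_n = [c]_n
            if (a + c) % n != (b + d) % n:
                return False
            if (a * c) % n != (b * d) % n:
                return False
    return True


if __name__ == "__main__":
    print(well_defined(5), well_defined(8))
    # The two hand computations from the examples:
    print((17 * -7) % 5, (-11 * 39) % 8)
```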


Proof of Theorem 7.2. As just indicated, the main part of this proof will be showing that
addition and multiplication in $\mathbb{Z}_n$ are well defined. To this end, suppose $[a]_n = [b]_n$ and
$[c]_n = [d]_n$; we then need to show that $[a]_n + [c]_n = [b]_n + [d]_n$ and $[a]_n \cdot [c]_n = [b]_n \cdot [d]_n$.

By the definition of addition and multiplication in $\mathbb{Z}_n$, we have

$$[a]_n + [c]_n = [a+c]_n, \quad [b]_n + [d]_n = [b+d]_n,$$
$$[a]_n \cdot [c]_n = [a \cdot c]_n \quad \text{and} \quad [b]_n \cdot [d]_n = [b \cdot d]_n.$$

As a result, in order to verify that

$$[a]_n + [c]_n = [b]_n + [d]_n \quad \text{and} \quad [a]_n \cdot [c]_n = [b]_n \cdot [d]_n,$$

we must show that

$$[a+c]_n = [b+d]_n \quad \text{and} \quad [a \cdot c]_n = [b \cdot d]_n.$$

However, using the definition of the relation $\sim_n$, this means that we need to show that n
divides $(a+c) - (b+d)$ and n also divides $a \cdot c - b \cdot d$.


Since $[a]_n = [b]_n$ and $[c]_n = [d]_n$, we know that n divides both $a - b$ and $c - d$. Thus,

(1) $a - b = n \cdot m_1$ and $c - d = n \cdot m_2$,

for some $m_1, m_2 \in \mathbb{Z}$. As a result,

$$(a+c) - (b+d) = (a-b) + (c-d) = n \cdot m_1 + n \cdot m_2 = n(m_1 + m_2).$$

Therefore, n divides $(a+c) - (b+d)$ and addition in $\mathbb{Z}_n$ is indeed well defined.
To see that multiplication is well defined, we first refer back to (1) and rewrite it as

$$a = b + n \cdot m_1 \quad \text{and} \quad c = d + n \cdot m_2.$$

Multiplying the preceding equations, we have

$$a \cdot c = (b + n \cdot m_1)(d + n \cdot m_2) = b \cdot d + n(b \cdot m_2 + d \cdot m_1 + n \cdot m_1 \cdot m_2).$$

As a result,

$$a \cdot c - b \cdot d = n(b \cdot m_2 + d \cdot m_1 + n \cdot m_1 \cdot m_2),$$

which immediately tells us that n divides $a \cdot c - b \cdot d$. Thus, multiplication in $\mathbb{Z}_n$ is also well
defined.


Having shown that addition and multiplication in $\mathbb{Z}_n$ are well defined, showing that $\mathbb{Z}_n$ is a
commutative ring will now be quite routine. Among the properties that we need to show are
satisfied in $\mathbb{Z}_n$ are two associative laws, two commutative laws, and both parts of the
distributive law. Fortunately, it is not hard to see that all of these properties are inherited by $\mathbb{Z}_n$
directly from $\mathbb{Z}$. For example, given classes $[a]_n$, $[b]_n$, and $[c]_n$, we have

(2) $([a]_n + [b]_n) + [c]_n = [a+b]_n + [c]_n = [(a+b)+c]_n = [a+(b+c)]_n = [a]_n + [b+c]_n = [a]_n + ([b]_n + [c]_n).$

Observe that in (2), the equality $[(a+b)+c]_n = [a+(b+c)]_n$ follows directly from the
associativity of addition in $\mathbb{Z}$. All the other equalities in (2) follow from the definition of
addition in $\mathbb{Z}_n$. Thus addition in $\mathbb{Z}_n$ is associative.


Similarly, using one part of the distributive law in $\mathbb{Z}$, we have

$$([a]_n + [b]_n) \cdot [c]_n = [a+b]_n \cdot [c]_n = [(a+b)c]_n = [a \cdot c + b \cdot c]_n = [a \cdot c]_n + [b \cdot c]_n = [a]_n \cdot [c]_n + [b]_n \cdot [c]_n.$$

Note that the equality $[(a+b)c]_n = [a \cdot c + b \cdot c]_n$ follows from one part of the distributive law
in $\mathbb{Z}$ and the other equalities above follow from the definition of addition and multiplication in
$\mathbb{Z}_n$. Thus, $\mathbb{Z}_n$ satisfies one part of the distributive law.


It is no harder to show that $\mathbb{Z}_n$ also inherits the associativity of multiplication, both
commutative laws, and the other part of the distributive law from $\mathbb{Z}$. In light of this, we will
omit the details.


Also, observe that

$$[a]_n + [0]_n = [a+0]_n = [a]_n = [0+a]_n = [0]_n + [a]_n$$

and

$$[a]_n \cdot [1]_n = [a \cdot 1]_n = [a]_n = [1 \cdot a]_n = [1]_n \cdot [a]_n.$$

Thus, $[0]_n$ and $[1]_n$ are, respectively, the additive and multiplicative identity elements of $\mathbb{Z}_n$.
Finally,

$$[a]_n + [-a]_n = [a-a]_n = [0]_n = [-a+a]_n = [-a]_n + [a]_n,$$

so $[-a]_n$ is the additive inverse of $[a]_n$. As a result, $\mathbb{Z}_n$ does indeed satisfy all the properties of a
commutative ring.




7.2 Zero Divisors and Invertible Elements 


It is interesting to notice how different $\mathbb{Z}_2, \mathbb{Z}_3, \mathbb{Z}_4, \mathbb{Z}_5, \ldots$ are from our earlier examples of
commutative rings like $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, $\mathbb{Z}[x]$, and $\mathbb{Q}[x]$. The most glaring difference is that each
$\mathbb{Z}_n$ is a finite set, whereas all of our previous examples were infinite. In addition, observe that
in $\mathbb{Z}_{100}$,

$$[20]_{100}^2 = [20]_{100} \cdot [20]_{100} = [400]_{100} = [0]_{100}.$$

Thus, in $\mathbb{Z}_{100}$, we can multiply a nonzero element by itself and obtain 0. Certainly, nothing like
this happened in our earlier examples of commutative rings. To better understand some of the
differences between $\mathbb{Z}_n$ and our earlier examples, we need to introduce some terminology.


Many of the terms and concepts we are about to introduce do not require that multiplication be 
commutative. Recall that if a set satisfies all the properties of a commutative ring with the 
possible exception that multiplication may not be commutative, we simply refer to it as a ring. 
Thus, multiplication in a ring may be commutative, but it is not required to be commutative. 
In particular, every commutative ring is a ring, but not every ring is a commutative ring. 
If you are not familiar with the addition and multiplication of square matrices, then you may 
have never seen an example of a ring that is not a commutative ring. To remedy this, we 
present the following. 


Example

Let R be the ring consisting of all polynomials in the variables x and y with coefficients in
$\mathbb{R}$ with the special condition that $yx = 0$. Therefore, in R, multiplication is identical to
the multiplication of polynomials in two-variable calculus with one huge difference.
Namely, whenever we see a term of the form yx, instead of letting yx equal xy, we replace
yx by 0. For example, in R, if we use the distributive law and the fact that $yx = 0$, we have
the following:

$$(3 + 2x - 7y + 4x^2y - 8xy^2)(5x + y) =$$
$$15x + 10x^2 - 35yx + 20x^2yx - 40xy^2x + 3y + 2xy - 7y^2 + 4x^2y^2 - 8xy^3 =$$
$$15x + 10x^2 + 3y + 2xy - 7y^2 + 4x^2y^2 - 8xy^3.$$

Also note that in R, the order of elements in a multiplication problem can affect whether
or not the product is 0, as $x \cdot y \ne 0$ and $y \cdot x = 0$. Furthermore, note that the element xy
has the interesting property that it is nonzero, yet its square is 0, as

$$(xy)^2 = (xy)(xy) = x(yx)y = x \cdot 0 \cdot y = 0.$$




Definition 7.3. Let R be a ring.

(a) A nonzero element $r \in R$ is called a zero divisor if there is a nonzero element $s \in R$ such
that either $rs = 0$ or $sr = 0$.

(b) An element $r \in R$ is said to be invertible if there exists some $s \in R$ such that $rs = 1 = sr$.
In this case, we write $s = r^{-1}$.


Examples

In the previous example, the fact that $yx = 0$ tells us that both x and y are zero divisors.
Furthermore, since

$$(1 + xy)(1 - xy) = 1 + xy - xy - xyxy = 1$$

and

$$(1 - xy)(1 + xy) = 1 - xy + xy - xyxy = 1,$$

we can see that both $1 + xy$ and $1 - xy$ are invertible in R as they are each the
multiplicative inverse of the other.


Before this chapter, almost all of our examples of rings did not have zero divisors. However,
depending on the value of n, $\mathbb{Z}_n$ may have many zero divisors.


Examples

In $\mathbb{Z}_{10}$, we have

$$[2]_{10} \cdot [5]_{10} = [4]_{10} \cdot [5]_{10} = [6]_{10} \cdot [5]_{10} = [8]_{10} \cdot [5]_{10} = [0]_{10}.$$

Thus, in $\mathbb{Z}_{10}$, $[2]_{10}, [4]_{10}, [5]_{10}, [6]_{10}, [8]_{10}$ are all zero divisors. Similarly, in $\mathbb{Z}_8$, we have

$$[2]_8 \cdot [4]_8 = [6]_8 \cdot [4]_8 = [0]_8.$$

Therefore, in $\mathbb{Z}_8$, $[2]_8, [4]_8, [6]_8$ are all zero divisors.


Next, we observe that an element of a ring cannot be both invertible and a zero divisor. 


Lemma 7.4. If R is a ring and $r \in R$, then r cannot be both invertible and a zero divisor.




Proof. Lemma 5.14(c) asserts that multiplying by 0 in a commutative ring always gives 0 as
the answer. However, as we remarked at the time, the proof of that fact did not require that
multiplication be commutative. Now suppose $r \in R$ such that r has a multiplicative inverse
$r^{-1}$. We need to show that in this case, r cannot also be a zero divisor.

To this end, if $t \in R$ such that $tr = 0$, then Lemma 5.14(c), along with the associativity of
multiplication, tells us that

$$0 = 0 \cdot r^{-1} = (tr)r^{-1} = t(r \cdot r^{-1}) = t \cdot 1 = t.$$

On the other hand, if $rt = 0$, an almost identical argument shows that

$$0 = r^{-1} \cdot 0 = r^{-1}(rt) = (r^{-1} \cdot r)t = 1 \cdot t = t.$$

As a result, if either $tr = 0$ or $rt = 0$, then $t = 0$. Since there does not exist a nonzero element t
such that either $tr = 0$ or $rt = 0$, r is not a zero divisor. Hence, no element of R can
simultaneously be invertible and a zero divisor.


Whereas Lemma 7.4 indicates that an element of a ring cannot simultaneously be invertible 
and a zero divisor, we should be aware that many elements of a ring are neither invertible nor a 
zero divisor. For example, in Z, there are no zero divisors and the only invertible elements are 
1 and —1. In Q[x], once again there are no zero divisors, and the only invertible elements are 
the nonzero polynomials of degree 0. In particular, every polynomial in Q[x] of degree at least 
1 is neither invertible nor a zero divisor. However, as we examine the following examples, we 
will see that the situation is quite different in $\mathbb{Z}_n$.


Examples

In an earlier example, we saw that $[2]_{10}, [4]_{10}, [5]_{10}, [6]_{10}, [8]_{10}$ are all zero divisors in $\mathbb{Z}_{10}$.
However, we also have

$$[1]_{10} \cdot [1]_{10} = [3]_{10} \cdot [7]_{10} = [9]_{10} \cdot [9]_{10} = [1]_{10}.$$

Since multiplication in $\mathbb{Z}_{10}$ is commutative, we see that $[1]_{10}, [3]_{10}, [7]_{10}, [9]_{10}$ are all
invertible in $\mathbb{Z}_{10}$. Thus every nonzero element in $\mathbb{Z}_{10}$ is either invertible or a zero divisor.
Returning to $\mathbb{Z}_8$, earlier we saw that $[2]_8, [4]_8, [6]_8$ are all zero divisors. On the other
hand, we have

$$[1]_8 \cdot [1]_8 = [3]_8 \cdot [3]_8 = [5]_8 \cdot [5]_8 = [7]_8 \cdot [7]_8 = [1]_8.$$

Therefore, $[1]_8, [3]_8, [5]_8, [7]_8$ are all invertible in $\mathbb{Z}_8$. As a result, it is also the case in $\mathbb{Z}_8$
that every nonzero element is either invertible or a zero divisor.
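Sorting the nonzero classes by hand quickly becomes tedious, so here is a brute-force sketch in Python that does it by direct search. The function name `classify` is our own, and each class is represented by its name in the range 1 to n - 1. It applies Definition 7.3 literally: a class is invertible if some product equals 1, and a zero divisor if some product with a nonzero class equals 0.

```python
def classify(n: int):
    """Split the nonzero classes of Z_n into invertible elements and zero divisors."""
    invertible, zero_divisors = [], []
    for a in range(1, n):
        # Invertible: some s with a*s = 1 in Z_n.
        if any((a * s) % n == 1 for s in range(1, n)):
            invertible.append(a)
        # Zero divisor: some nonzero s with a*s = 0 in Z_n.
        if any((a * s) % n == 0 for s in range(1, n)):
            zero_divisors.append(a)
    return invertible, zero_divisors


if __name__ == "__main__":
    for n in (8, 10):
        units, zds = classify(n)
        print(n, units, zds)
```

For n = 8 and n = 10 the two lists reproduce the examples above, and for every modulus we tried the two lists are disjoint and together exhaust the nonzero classes.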




Based on the preceding examples, it is reasonable to wonder if every nonzero element of $\mathbb{Z}_n$
must be either invertible or a zero divisor. In fact, not only is this true for $\mathbb{Z}_n$, but it is also true
for all rings that are finite. Interestingly enough, it is no harder to prove this fact for all finite
rings than it is to prove it solely for rings of the form $\mathbb{Z}_n$.


Theorem 7.5. If a ring R is a finite set, then every nonzero element of R is either invertible or
a zero divisor. Furthermore, if $a \in R$ is invertible, then there is a positive integer m such that
$a^m = 1$.


Proof. Suppose $a \in R$ such that there exists a positive integer m such that $a^m = 1$. Since
$m - 1 \ge 0$, it follows that $a^{m-1} \in R$, and we immediately see that

$$a \cdot a^{m-1} = a^m = 1 = a^m = a^{m-1} \cdot a.$$

Thus, $a^{m-1}$ is the multiplicative inverse of a, and so a is indeed invertible.


As a result, in order to prove our result, it suffices to show that if $a \in R$ is not a zero divisor
and if R is finite, then there exists a positive integer m such that $a^m = 1$. We begin by
considering the sequence

$$a, a^2, a^3, a^4, a^5, \ldots$$

Since R is a finite set and every element of the sequence belongs to R, there must be elements
of R that appear more than once in the sequence. As a result, we can let k be the smallest
positive integer such that there exists a larger integer j such that $a^j = a^k$. In particular, $a^k$ is the
first term of the sequence that reappears later in the sequence.


Since $a^j = a^k$ and $j > k \ge 1$, we now have

(3) $a(a^{j-1} - a^{k-1}) = a^j - a^k = 0.$

There are now two possibilities, either $a^{j-1} - a^{k-1} \ne 0$ or $a^{j-1} - a^{k-1} = 0$. If the first
possibility occurs, then (3) tells us that we have multiplied a by the nonzero element
$a^{j-1} - a^{k-1}$ and obtained 0 as an answer. However, this contradicts our assumption that a is
not a zero divisor.

As a result, it must be the case that $a^{j-1} - a^{k-1} = 0$, so

$$a^{j-1} = a^{k-1}.$$

However, recall that k had the property that it was the smallest positive integer such that $a^k$
reappears later in the preceding sequence. If $k - 1$ was a positive integer, then the fact that
$a^{j-1} = a^{k-1}$ would tell us that $a^{k-1}$ is a term in the sequence that appears earlier than $a^k$, yet
reappears later in the sequence. This would contradict the fact that $a^k$ is the first element of the
sequence that eventually reappears. In light of this, it must be the case that $k - 1$ is not a
positive integer. Since k is a positive integer and $k - 1$ is not a positive integer, it must be the
case that $k = 1$. If we now return to (3) and use the fact that $k = 1$, we now have

$$0 = a^j - a^k = a^j - a^1 = a \cdot a^{j-1} - a \cdot 1 = a(a^{j-1} - 1).$$

Since a is not a zero divisor, this tells us that $a^{j-1} - 1 = 0$, so $a^{j-1} = 1$. However, since $j > k$,
we know that $j - 1 \ge 1$. Therefore, if we let $m = j - 1$, we now have

$$a^m = 1$$

for some $m \ge 1$, thereby proving our result.
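The proof is constructive enough to turn into a small experiment in $\mathbb{Z}_n$. The sketch below is in Python, and the function name `power_inverse` is a hypothetical choice of ours. It walks along the sequence a, a^2, a^3, ... until it meets 1, and then returns both the exponent m and the inverse a^(m-1). Because $\mathbb{Z}_n$ has only n elements, n steps are enough; if 1 never appears, the class is not invertible.

```python
def power_inverse(a: int, n: int):
    """Return (m, inverse) with a**m = 1 in Z_n and inverse = a**(m-1).

    Returns None when no power of a equals 1, i.e. when [a]_n is not invertible.
    Assumes n >= 2.
    """
    a %= n
    power = a                          # holds a^m reduced modulo n
    for m in range(1, n + 1):
        if power == 1:
            return m, pow(a, m - 1, n)
        power = (power * a) % n
    return None


if __name__ == "__main__":
    print(power_inverse(2, 5))     # exponent and inverse of [2]_5
    print(power_inverse(5, 6))     # exponent and inverse of [5]_6
    print(power_inverse(2, 10))    # [2]_10 is a zero divisor
```

This is far from the fastest way to invert a class, but it mirrors the argument of Theorem 7.5 step by step.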


Specializing Theorem 7.5 to rings of the form $\mathbb{Z}_n$, we now have

Corollary 7.6. In $\mathbb{Z}_n$, every nonzero element is either invertible or is a zero divisor.


Examples

In $\mathbb{Z}_3$, $[1]_3$ and $[2]_3$ are the only invertible elements and

$[2]_3^2 = [1]_3$, thus $[2]_3^{-1} = [2]_3^1 = [2]_3$.

In $\mathbb{Z}_4$, $[1]_4$ and $[3]_4$ are the only invertible elements and

$[3]_4^2 = [1]_4$, thus $[3]_4^{-1} = [3]_4^1 = [3]_4$.

In $\mathbb{Z}_5$, $[1]_5, [2]_5, [3]_5, [4]_5$ are the only invertible elements and

$[2]_5^4 = [1]_5$, thus $[2]_5^{-1} = [2]_5^3 = [2^3]_5 = [3]_5$,
$[3]_5^4 = [1]_5$, thus $[3]_5^{-1} = [3]_5^3 = [3^3]_5 = [2]_5$,
$[4]_5^2 = [1]_5$, thus $[4]_5^{-1} = [4]_5^1 = [4]_5$.

In $\mathbb{Z}_6$, $[1]_6$ and $[5]_6$ are the only invertible elements and

$[5]_6^2 = [1]_6$, thus $[5]_6^{-1} = [5]_6^1 = [5]_6$.


As we know, every ring is a group under addition. However, as the next proposition indicates,
the invertible elements of a ring form a group under multiplication. Therefore, not only do the
rings $\mathbb{Z}_n$ provide us with new examples of rings, but they also provide us with new examples
of groups.




Proposition 7.7. If R is a ring, then the invertible elements of R, denoted as U(R), are a 
group under multiplication. 


Proof. The first thing we need to check is that U(R) is closed under multiplication. If
$a, b \in U(R)$, then $a, b, a^{-1}, b^{-1}$ are all elements of R. As a result, $ab, b^{-1}a^{-1} \in R$. Next,
observe that

$$(ab)(b^{-1}a^{-1}) = a(b \cdot b^{-1})a^{-1} = a \cdot 1 \cdot a^{-1} = a \cdot a^{-1} = 1$$

and

$$(b^{-1}a^{-1})(ab) = b^{-1}(a^{-1} \cdot a)b = b^{-1} \cdot 1 \cdot b = b^{-1} \cdot b = 1.$$

The preceding equations reveal that $b^{-1}a^{-1}$ is the multiplicative inverse of ab. Thus, ab is an
element of R, which has a multiplicative inverse in R. Hence, $ab \in U(R)$, so U(R) is indeed
closed under multiplication.


Since multiplication in R is associative and U(R) is a subset of R, it is clear that multiplication
in U(R) is also associative. If we let 1 denote the multiplicative identity of R, then $1 \cdot a = a =
a \cdot 1$, for all $a \in R$. Therefore, in order to conclude that 1 is the identity element of U(R), we
only need to show that $1 \in U(R)$. However, since 1 is its own multiplicative inverse, it is
certainly the case that $1 \in U(R)$. Finally, if $a \in U(R)$, then the equation

$$a \cdot a^{-1} = 1 = a^{-1} \cdot a$$

indicates that not only is $a^{-1}$ the inverse of a but also that a is the inverse of $a^{-1}$. Thus, $a^{-1}$ is
an element of R whose multiplicative inverse belongs to R. Hence, $a^{-1} \in U(R)$, and we have
shown that every element of U(R) has an inverse in U(R). Therefore, U(R) satisfies all the
properties of a group.


Although the proof of Proposition 7.7 is very abstract, we can represent one of the main ideas
of the proof very concretely. If a and b are both invertible, then, at first glance, one might
guess that the inverse of ab is $a^{-1}b^{-1}$. However, as we saw in the preceding proof, $b^{-1}a^{-1}$ is
the inverse of ab. Therefore, in order to invert ab, not only do we need to invert both a and b to
obtain $a^{-1}$ and $b^{-1}$, but then we also need to reverse the order in which they appear to obtain
$b^{-1}a^{-1}$. Here are some easy and concrete examples to help you understand why we need to
reverse the order when finding the inverse of a product.


Examples

1. Suppose you are undressing. Earlier in the day, you had put on your socks and then your
shoes. To undress, you must undo, or invert, this act. But note that when undressing, you
must first take off your shoes and then take off your socks. Thus, whereas when dressing
you deal first with the socks and then with the shoes, when inverting this procedure, you
reverse the order and deal first with the shoes and then with the socks.

2. Similarly, when going to work in the morning, you might put on your coat, leave the
house, and enter the car. At the end of the day, you invert this procedure, and when you
get home, you leave the car, enter the house, and take off your coat. Thus, in the morning
you deal with, in order, coat-house-car. But in the evening, each individual action is
inverted, and the order is reversed as you deal with, in order, car-house-coat.


Examples

In the following examples, for various rings R, we examine the group of invertible
elements U(R).

1. $U(\mathbb{Z}) = \{1, -1\}$.

2. $U(\mathbb{Q})$ is the set of all nonzero elements of $\mathbb{Q}$. In fact, observe that if R is a
commutative ring, then R is a field if and only if U(R) is equal to the entire set of
nonzero elements of R.

3. $U(\mathbb{Z}[x]) = \{1, -1\}$, as no polynomial of degree at least one has an inverse in $\mathbb{Z}[x]$.

4. $U(\mathbb{R}[x])$ is the set of nonzero real numbers, as no polynomial of degree at least one
has an inverse in $\mathbb{R}[x]$.

5. $U(\mathbb{Z}_2) = \{[1]_2\}$.

6. $U(\mathbb{Z}_3) = \{[1]_3, [2]_3\}$.

7. $U(\mathbb{Z}_4) = \{[1]_4, [3]_4\}$.

8. $U(\mathbb{Z}_5) = \{[1]_5, [2]_5, [3]_5, [4]_5\}$.

9. $U(\mathbb{Z}_6) = \{[1]_6, [5]_6\}$.

10. $U(\mathbb{Z}_8) = \{[1]_8, [3]_8, [5]_8, [7]_8\}$.

11. $U(\mathbb{Z}_{10}) = \{[1]_{10}, [3]_{10}, [7]_{10}, [9]_{10}\}$.


Examining groups of the form $U(\mathbb{Z}_n)$ provides us with many examples of finite groups. In light
of this, two questions regarding $U(\mathbb{Z}_n)$ and the invertible elements of $\mathbb{Z}_n$ that you may have are

1. For each $n > 1$, is there an easy way to determine which of the nonzero elements of $\mathbb{Z}_n$ are
invertible and which are zero divisors?

2. For each $n > 1$, is there an easy way to determine the size of the set $U(\mathbb{Z}_n)$?




Before answering the first of our two questions, let us look back at some of our previous
examples. In $\mathbb{Z}_6$, the invertible elements are $[1]_6, [5]_6$, and the zero divisors are $[2]_6, [3]_6, [4]_6$.
In $\mathbb{Z}_{10}$, the invertible elements are $[1]_{10}, [3]_{10}, [7]_{10}, [9]_{10}$, and the zero divisors are
$[2]_{10}, [4]_{10}, [5]_{10}, [6]_{10}, [8]_{10}$. It is usually a good idea to look for a pattern before trying to
prove a general result. When we look at $\mathbb{Z}_6$, we can observe that the classes $[a]_6$ that were
invertible were precisely those where a was relatively prime to 6. Similarly, when looking at
$\mathbb{Z}_{10}$, we can see that the classes $[a]_{10}$ that were invertible were precisely those where a was
relatively prime to 10. These observations lead us to


Theorem 7.8. In $\mathbb{Z}_n$, the class $[a]_n$ is invertible if and only if a and n are relatively prime.


Proof. In one direction, if a and n are relatively prime, then the greatest common divisor of a
and n is 1, and Theorem 3.8 tells us that there exist integers s, t such that

$$s \cdot a + t \cdot n = 1.$$

This equation is equivalent to the equation $s \cdot a - 1 = -t \cdot n$. Therefore, the difference
between $s \cdot a$ and 1 is divisible by n, which tells us that $s \cdot a \sim_n 1$. As a result, $[s \cdot a]_n = [1]_n$.
Using the definition of multiplication in $\mathbb{Z}_n$, we now know that

$$[s]_n \cdot [a]_n = [1]_n.$$

Since multiplication in $\mathbb{Z}_n$ is commutative, we also know that $[a]_n \cdot [s]_n = [1]_n$. Therefore, $[s]_n$
is the multiplicative inverse of $[a]_n$, and so $[a]_n$ is invertible.


In the other direction, if $[a]_n$ is invertible, then there exists some $u \in \mathbb{Z}$ such that

$$[u]_n \cdot [a]_n = [1]_n = [a]_n \cdot [u]_n.$$

Since $[u]_n \cdot [a]_n = [u \cdot a]_n$, we know that $u \cdot a \sim_n 1$. As a result, n divides $u \cdot a - 1$, which
immediately tells us that there exists some $v \in \mathbb{Z}$ such that $u \cdot a - 1 = v \cdot n$. But this equation
is equivalent to the equation

$$u \cdot a + (-v) \cdot n = 1.$$

However, having written 1 as a multiple of a plus a multiple of n, Theorem 3.8 tells us that we
have indeed shown that a and n are relatively prime.
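The first half of the proof doubles as a recipe for computing inverses: find integers s and t with s·a + t·n = 1, and then $[s]_n$ is the inverse of $[a]_n$. One standard way to find such s and t is the extended Euclidean algorithm. The text only asserts that s and t exist (Theorem 3.8), so the particular algorithm and the function names `extended_gcd` and `inverse_mod` in this Python sketch are our own choices.

```python
def extended_gcd(a: int, n: int):
    """Return (g, s, t) with g = gcd(a, n) and s*a + t*n = g, for a >= 0 and n > 0."""
    old_r, r = a, n
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Invariant: old_r = old_s*a + old_t*n and r = s*a + t*n.
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def inverse_mod(a: int, n: int):
    """Return the name in 0..n-1 of the inverse of [a]_n, or None if gcd(a, n) != 1."""
    g, s, _ = extended_gcd(a % n, n)
    if g != 1:
        return None
    return s % n


if __name__ == "__main__":
    print(inverse_mod(3, 10))   # inverse of [3]_10
    print(inverse_mod(4, 10))   # 4 and 10 are not relatively prime
```

In keeping with Theorem 7.8, `inverse_mod` returns a class exactly when a and n are relatively prime.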


Recall that every field is a commutative ring, but not every commutative ring is a field. In
particular, for those n such that $\mathbb{Z}_n$ has zero divisors, we know that $\mathbb{Z}_n$ is a commutative ring
that is not a field. This then raises the question, for which values of n is $\mathbb{Z}_n$ a field? If you look
back at the examples in this chapter, you may already have a good idea what the answer to this
question is. In fact, the answer to this question follows directly from Theorem 7.8.


Corollary 7.9. $\mathbb{Z}_n$ is a field if and only if n is a prime number.


Proof. In $\mathbb{Z}_n$, the nonzero equivalence classes are

$$[1]_n, [2]_n, [3]_n, \ldots, [n-2]_n, [n-1]_n.$$

In one direction, if n is prime, then every one of the integers $1, 2, 3, \ldots, n-2, n-1$ is
relatively prime to n. Therefore, Theorem 7.8 asserts that every nonzero class in $\mathbb{Z}_n$ has a
multiplicative inverse. Thus, $\mathbb{Z}_n$ is a field.


In the other direction, if n is not prime, then there exists a positive integer a such that a divides
n and $1 < a < n$. As a result, $[a]_n$ is a nonzero equivalence class in $\mathbb{Z}_n$ where a and n are not
relatively prime. Thus, Theorem 7.8 tells us that $[a]_n$ is not invertible in $\mathbb{Z}_n$. However, since
$\mathbb{Z}_n$ now contains a nonzero element that is not invertible, $\mathbb{Z}_n$ cannot be a field.
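Corollary 7.9 is easy to test for small moduli. The Python sketch below uses two helper names of our own, `is_prime` and `is_field`. Following Theorem 7.8, it decides whether $\mathbb{Z}_n$ is a field by checking that every a in the range 1 to n - 1 is relatively prime to n, and it compares that answer with a plain trial-division primality test.

```python
from math import gcd


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small moduli used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_field(n: int) -> bool:
    """Z_n is a field exactly when every nonzero class is invertible,
    which by Theorem 7.8 means gcd(a, n) = 1 for a = 1, ..., n-1."""
    return all(gcd(a, n) == 1 for a in range(1, n))


if __name__ == "__main__":
    print([n for n in range(2, 31) if is_field(n)])
```

The list printed by the usage lines should consist of exactly the primes up to 30.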


Exercises for Sections 7.1 and 7.2 


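1. In $\mathbb{Z}_8$, compute $[3]_8 + [7]_8$, $[3]_8 \cdot [7]_8$, $[4]_8 + [6]_8$, and $[4]_8 \cdot [6]_8$.
2. In $\mathbb{Z}_8$, compute $[5]_8 + [4]_8$, $[5]_8 \cdot [4]_8$, $[6]_8 + [2]_8$, and $[6]_8 \cdot [2]_8$.
3. In $\mathbb{Z}_{21}$, compute $[14]_{21} + [12]_{21}$, $[14]_{21} \cdot [12]_{21}$, $[18]_{21} + [13]_{21}$, and $[18]_{21} \cdot [13]_{21}$.
4. In $\mathbb{Z}_{21}$, compute $[9]_{21} + [18]_{21}$, $[9]_{21} \cdot [18]_{21}$, $[6]_{21} + [14]_{21}$, and $[6]_{21} \cdot [14]_{21}$.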
5. In Ze, compute the square of each nonzero element. 
6. In Zg, compute the square of each nonzero element. 
7. In Zs, compute the cube of each nonzero element. 
8. In Zs, compute the fourth power of each nonzero element. 
9. In Zjo, list all the invertible elements and find the inverse of each element. 
10. In Zy4, list all the invertible elements and find the inverse of each element. 
11. In Zs, list all the invertible elements and find the inverse of each element. 
12. In Zp, list all the invertible elements and find the inverse of each element. 
13. If today is Monday, what day of the week will it be in 100 days? 
14. If today is Tuesday, what day of the week will it be in 1000 days? 
15. If today is Wednesday, what day of the week will it be in 1,000,000 days? 
16. A child recites the alphabet over and over again. If she stops after reciting 187 letters,
what was the last letter she said?

17. Let $n \ge 2$ be an integer and let $\sigma : \mathbb{Z} \to \mathbb{Z}_n$ be the function defined as $\sigma(a) = [a]_n$, for all
$a \in \mathbb{Z}$.
(a) Show that $\sigma(a+b) = \sigma(a) + \sigma(b)$, for all $a, b \in \mathbb{Z}$.
(b) Show that $\sigma(a \cdot b) = \sigma(a) \cdot \sigma(b)$, for all $a, b \in \mathbb{Z}$.
(c) Show that $\sigma(a_1 + a_2 + \cdots + a_m) = \sigma(a_1) + \sigma(a_2) + \cdots + \sigma(a_m)$, for all $m \in \mathbb{N}$ and $a_i \in \mathbb{Z}$.
(d) Show that $\sigma(a_1 \cdot a_2 \cdots a_m) = \sigma(a_1) \cdot \sigma(a_2) \cdots \sigma(a_m)$, for all $m \in \mathbb{N}$ and $a_i \in \mathbb{Z}$.

18. Let $f(x) \in \mathbb{Z}[x]$ and let $n \ge 2$ be an integer. Use exercise 17 to prove that if $a, b \in \mathbb{Z}$ such
that $[a]_n = [b]_n$, then $[f(a)]_n = [f(b)]_n$.


For exercise 19, please read the following: 


Mathematicians have long searched for systematic ways of producing examples of prime
numbers. This search leads naturally to the following question: Does there exist a polynomial
f(x) with integer coefficients of degree at least 1 such that f(n) is prime, for every $n \in \mathbb{N}$?
If such a polynomial existed, then the infinite sequence

$$f(1), f(2), f(3), f(4), f(5), \ldots$$

would consist entirely of prime numbers. However, in the next exercise, we will show that no
such polynomial can exist.

Even though we will prove that there does not exist a nonconstant polynomial that produces
only values which are prime, there are some interesting results worth mentioning that deal
with prime values of polynomials. Consider the polynomial

$$g(x) = x^2 - x + 41.$$

It will follow from our next exercise that there is some natural number n such that g(n) is not
prime. In fact, it is not hard to find such a natural number. Observe that if we let $x = 41$, then
$g(41) = 41^2 - 41 + 41 = 41^2$ is not prime. However, the interesting thing to note about g(x)
is that

$$g(1) = 41, \; g(2) = 43, \; g(3) = 47, \ldots, \; g(39) = 1523, \; g(40) = 1601$$

are all prime. Thus, whereas no nonconstant polynomial produces only prime values, we have
exhibited a quadratic whose first 40 values are all prime. In light of this we can ask, is there a
polynomial of small degree whose first 100 values are all prime? Or is there a polynomial of
small degree whose first 1,000,000,000 values are all prime?
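The claim about the first 40 values of g(x) can be confirmed directly. The following Python sketch uses a simple trial-division helper, `is_prime`, which is our own and not part of the text, to test g(1) through g(40) and then to look at g(41).

```python
def is_prime(k: int) -> bool:
    """Trial division; fine for numbers of this size."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def g(x: int) -> int:
    """The polynomial g(x) = x^2 - x + 41 from the text."""
    return x * x - x + 41


if __name__ == "__main__":
    values = [g(x) for x in range(1, 41)]
    print(all(is_prime(v) for v in values))   # are the first 40 values all prime?
    print(g(41), is_prime(g(41)))             # g(41) = 41^2, so it is not prime
```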


In 2004, B. Green and T. Tao proved a remarkable result in answering these questions.
They proved that for any natural number N, there exists a linear polynomial h(x) such that
its first N values

$$h(1), h(2), h(3), h(4), \ldots, h(N-1), h(N)$$

are all prime.


19. Let $f(x) \in \mathbb{Z}[x]$ have degree at least 1 and suppose $a \in \mathbb{Z}$ such that $f(a) = p$, where p is a
prime number.
(a) Use exercise 18 to show that if $b \in \mathbb{Z}$ such that $[b]_p = [a]_p$, then f(b) is a multiple
of p.
(b) Use part (a) to show that there exists some positive integer $b \in [a]_p$ such that f(b) is
not a prime number.
(c) Conclude that there cannot exist a polynomial in $\mathbb{Z}[x]$ of degree at least 1 with the
property that f(n) is prime, for every $n \in \mathbb{N}$.

20. Let $f(x) \in \mathbb{Z}[x]$ and let $n \ge 2$ be an integer such that $[f(a)]_n \ne [0]_n$, for
$a = 0, 1, 2, \ldots, n-2, n-1$. Prove that f(x) has no roots in the integers. (Hint: Look at
exercise 18.)

21. Use n = 2 in exercise 20 to show that 4x° - 17x* + 3x? - 8x2 + 16x - 9 has no roots in $\mathbb{Z}$.

22. Use n = 2 in exercise 20 to show that 9x8 - 5x° + 14x? - 36x + 27 has no roots in $\mathbb{Z}$.

23. Use n = 3 in exercise 20 to show that 10x° - 7x* - 4x3 + 11x? - 21x + 85 has no roots
in $\mathbb{Z}$.

24. Use n = 5 in exercise 20 to show that 6x® - 3x° + 4x* + 8x - 72 has no roots in $\mathbb{Z}$.

25. Use n = 4 in exercise 20 to show that $3x^4 - 9x^3 + 8x^2 + 17x + 63$ has no roots in $\mathbb{Z}$.

26. In the previous exercise, you used n = 4 and exercise 20 to show that
$3x^4 - 9x^3 + 8x^2 + 17x + 63$ has no roots in $\mathbb{Z}$. Show that if you had used n = 2 and
exercise 20 to examine $3x^4 - 9x^3 + 8x^2 + 17x + 63$, then you do not obtain enough
information to determine whether $3x^4 - 9x^3 + 8x^2 + 17x + 63$ has any roots in $\mathbb{Z}$.

27. Show that using n = 3 and n = 5 in exercise 20 also does not give you enough information
to determine whether $3x^4 - 9x^3 + 8x^2 + 17x + 63$ has any roots in $\mathbb{Z}$.

28. Find an integer $n \ge 2$ such that you can use exercise 20 to show that
$4x^7 - 5x^6 + 11x^5 - 6x^4 + 9x^3 - 16x^2 - 23x + 105$ has no roots in $\mathbb{Z}$.

29. Find an integer $n \ge 2$ such that you can use exercise 20 to show that
$2x^4 - 7x^3 + 3x^2 + 11x + 70$ has no roots in $\mathbb{Z}$.

30. Let $f(x) \in \mathbb{Z}[x]$ and let $n \ge 2$ be an integer such that $[f(a)]_n = [0]_n$, for $a = 0, 1, 2, \ldots,
n-2, n-1$. Prove that for every $m \in \mathbb{Z}$, f(m) is a multiple of n. (Hint: Look at
exercise 18.)


244 Chapter 7 


31. 


32. 


33. 


34. 


35. 
36. 
37. 


Use exercise 30 to show that 7m!'! —5m® + 6m> + 4m? + 22m — 360 is divisible by 2, for 
every m € Z. 


G 


se exercise 30 to show that 8m> —4m* — 17m? + 10m? — 45m +90 is divisible by 3, for 
very m € Z. 


i) 


Use exercise 30 to show that 6m!2 + 3m!° — 4m? +7m® — 16m* + 9m — 100 is divisible 
by 5, for every m € Z. 


Use exercise 30 to show that 7m> — 4m* + 10m? — 23m? — 14m + 42 is divisible by both 
2 and 3 and is therefore divisible by 6, for every m € Z. 


35. If a is an odd integer, show that [a^2]_8 = [1]_8.

36. For any m ∈ N, examine [11^m - 1]_10 in Z_10, and then prove that 11^m - 1 is divisible by 10.

37. For any m ∈ N, examine [4^m + 2]_2 in Z_2 and [4^m + 2]_3 in Z_3. Then prove that 4^m + 2 is
divisible by 6.


The goal of exercises 38-53 is to develop divisibility tests. By this we mean that given a
positive integer N, we would like to develop shortcuts to determine if N is divisible by various
integers. We will begin with relatively easy tests to determine if an integer is divisible by 2, 5,
or powers of 10. Then we will develop tests to check divisibility by 3 and 9. Finally, we will
develop more sophisticated tests for divisibility by 7 and 11.

38. Show that a positive integer is divisible by 2 if and only if its one's digit is divisible by 2.

39. Show that a positive integer is divisible by 5 if and only if its one's digit is divisible by 5.

40. (a) Show that a positive integer is divisible by 10 if and only if its one's digit is 0.
(b) Generalize (a) and show that if m ∈ N, then a positive integer is divisible by 10^m if
and only if its rightmost m digits are all 0.

Given an integer n, let S(n) denote the sum of its digits. If S(n) ≥ 10, we can take the
sum of the digits of S(n). We can continue this process until we get an answer that is less
than 10, and we denote this as Ŝ(n). For example, if n = 6,574,280,357, then S(n) = 47
and Ŝ(n) = 2. In order to do the next exercise, it will be useful to recall that if the base 10
expansion of the positive integer n is a_m a_{m-1} ··· a_1 a_0, where 0 ≤ a_i ≤ 9, then

S(n) = a_m + a_{m-1} + ··· + a_1 + a_0 and n = 10^m a_m + 10^{m-1} a_{m-1} + ··· + 10 a_1 + a_0.

41. (a) If n ∈ N, show that [S(n)]_9 = [n]_9.
(b) If n ∈ N, show that [Ŝ(n)]_9 = [n]_9.
(c) Use (b) to prove that n ∈ N is divisible by 3 if and only if Ŝ(n) is divisible by 3.
(d) Use (b) to prove that n ∈ N is divisible by 9 if and only if Ŝ(n) is divisible by 9.

42. For each value of n, find S(n) and Ŝ(n). Then determine if n is divisible by 3 or 9.
(a) n = 4,284
(b) n = 51,782
(c) n = 483,609

43. For each value of n, find S(n) and Ŝ(n). Then determine if n is divisible by 3 or 9.
(a) n = 743,928,054
(b) n = 687,012,867
(c) n = 519,065,432

44. (a) Show that if m is an odd positive integer, then 10^m + 1 is divisible by 11.
(b) Show that if m is an even positive integer, then 10^m - 1 is divisible by 11.
(c) Conclude that 10^m - (-1)^m is divisible by 11, for all m ∈ N.

45. If the base 10 expansion of the positive integer n is a_m a_{m-1} ··· a_1 a_0, where 0 ≤ a_i ≤ 9,
then we define B(n) = a_0 - a_1 + a_2 - a_3 + ··· + (-1)^m a_m and can then define the
alternating sum of the digits A(n) as A(n) = |B(n)|. If A(n) ≥ 10, we can also take the
alternating sum of the digits of A(n). We can continue this process until we get an
answer that is less than 10, and we denote this as Â(n).
For part (a) of this exercise, you may want to refer to exercise 44.
(a) Let n ∈ N; show that [B(n)]_11 = [n]_11.
(b) Let n ∈ N; show that [A(n)]_11 = [0]_11 if and only if [n]_11 = [0]_11.
(c) Let n ∈ N; show that [Â(n)]_11 = [0]_11 if and only if [n]_11 = [0]_11.
(d) Let n ∈ N; show that n is divisible by 11 if and only if Â(n) is divisible by 11.

46. For each value of n, find B(n), A(n), and Â(n). Then determine if n is divisible by 11.
(a) n = 586
(b) n = 486,329
(c) n = 695,794

47. For each value of n, find B(n), A(n), and Â(n). Then determine if n is divisible by 11.
(a) n = 845,009,314,764
(b) n = 714,283,749,586
(c) n = 283,716,295,417

If the base 10 expansion of the positive integer n is a_m a_{m-1} ··· a_1 a_0, where 0 ≤ a_i ≤ 9,
then we define

V(n) = a_0 + 3a_1 + 2a_2 - a_3 - 3a_4 - 2a_5 + a_6 + 3a_7 + 2a_8 - a_9 - 3a_10 - 2a_11 + ···

and can then define the twisted sum T(n) as T(n) = |V(n)|. If T(n) ≥ 10, we can take the
twisted sum of the digits of T(n). We can continue this process until we get an answer
that is less than 10, and we denote this as T̂(n). In order to easily compute V(n) and T(n),
you should keep in mind the sequence

1, 3, 2, -1, -3, -2, 1, 3, 2, -1, -3, -2, 1, 3, 2, -1, -3, -2, ...

as these are the numbers that appear before the digits of n when computing V(n) and T(n).

48. Prove the following:
(a) If m ∈ N and [m]_6 = [0]_6, then [10^m]_7 = [1]_7.
(b) If m ∈ N and [m]_6 = [1]_6, then [10^m]_7 = [3]_7.
(c) If m ∈ N and [m]_6 = [2]_6, then [10^m]_7 = [2]_7.
(d) If m ∈ N and [m]_6 = [3]_6, then [10^m]_7 = [6]_7.
(e) If m ∈ N and [m]_6 = [4]_6, then [10^m]_7 = [4]_7.
(f) If m ∈ N and [m]_6 = [5]_6, then [10^m]_7 = [5]_7.

49. For part (a) of this exercise, you may want to refer to exercise 48.
(a) Let n ∈ N; show that [V(n)]_7 = [n]_7.
(b) Let n ∈ N; show that [T(n)]_7 = [0]_7 if and only if [n]_7 = [0]_7.
(c) Let n ∈ N; show that [T̂(n)]_7 = [0]_7 if and only if [n]_7 = [0]_7.
(d) Let n ∈ N; show that n is divisible by 7 if and only if T̂(n) is divisible by 7.

50. For each value of n, find V(n), T(n), and T̂(n). Then determine if n is divisible by 7.
(a) n = 955
(b) n = 672
(c) n = 864

51. For each value of n, find V(n), T(n), and T̂(n). Then determine if n is divisible by 7.
(a) n = 562,980,542
(b) n = 398,244,728
(c) n = 809,824,778

52. For each value of n, use the divisibility tests developed in these exercises to determine
which of the primes 2, 3, 5, 7, 11 divide n.
(a) n = 10,857
(b) n = 13,706
(c) n = 545,445


53. For each value of n, use the divisibility tests developed in these exercises to determine
which of the primes 2, 3, 5, 7, 11 divide n.
(a) n = 246,642
(b) n = 99,015
(c) n = 846,279

54. If G is a finite group and e denotes the identity element of G, show that for each g ∈ G,
there is an integer n ≥ 1 such that g^n = e. (Hint: Think about the proof of Theorem 7.5.)

55. If G is a finite group and e denotes the identity element of G, show that there is an
integer N ≥ 1 such that g^N = e, for all g ∈ G. (Note: In exercise 54, the integer n
depended upon the element g ∈ G. However, in this exercise, the integer N works for
all g ∈ G.)

56. Suppose R is a commutative ring with only a finite number of elements and no zero
divisors. Show that R is a field.

57. Let R be a ring and suppose r ∈ R such that r^2 = 0. Show that 1 + r has a multiplicative
inverse in R.

58. Let R be a ring and suppose r ∈ R and n ∈ N such that r^n = 0. Show that 1 + r has a
multiplicative inverse in R.

59. Let G be a group such that g^2 = e, for all g ∈ G, where e denotes the identity element of
G. Prove that G is commutative. (Note: You must prove that ab = ba, for all a, b ∈ G.)

60. Let R be a ring such that r^2 = r, for all r ∈ R. Prove that R is commutative. (Note: You
must prove that ab = ba, for all a, b ∈ R.)

61. Let R be a ring such that r^3 = r, for all r ∈ U(R). Prove that the group U(R) is
commutative.

62. In this exercise, you will provide the details we omitted in the proof of Theorem 7.2 that
Z_n is a commutative ring.
(a) Show that multiplication in Z_n is associative.
(b) Show that addition in Z_n is commutative.
(c) Show that multiplication in Z_n is commutative.
(d) Show that Z_n also satisfies the other part of the distributive law.

63. In Z_5, find all equivalence classes whose square is [1]_5.

64. In Z_7, find all equivalence classes whose square is [1]_7.

65. In Z_11, find all equivalence classes whose square is [1]_11.


66. In Z_13, find all equivalence classes whose square is [1]_13.

67. If p > 2 is a prime number, show that [1]_p and [p-1]_p are the only two equivalence classes
in Z_p whose square is equal to [1]_p.

68. If p is a prime number, show that [(p-1)!]_p = [p-1]_p in Z_p. (Hint: You may want to
refer to exercise 67.)

69. Let g(x) ∈ Z[x] have degree at least 2, and let p be a prime number such that
(i) the leading coefficient of g(x) is not divisible by p,
(ii) every other coefficient of g(x) is divisible by p,
(iii) the constant term of g(x) is not divisible by p^2.

(a) Show that if a ∈ Z such that [a]_p ≠ [0]_p, then [g(a)]_p ≠ [0]_p.
(b) Show that if b ∈ Z such that [b]_p = [0]_p, then [g(b)]_{p^2} ≠ [0]_{p^2}.
(c) Conclude that g(x) has no roots in Z.


Use exercise 69 with the prime p = 3 to show that 5x° + 9x+ — 30x? + 15x — 33 has no 
oots in Z. 


ot 


Use exercise 69 with the prime p = 2 to show that 9x7 — 42x° + 30x° — 22x37 + 16x +70 
has no roots in Z. 


se exercise 69 to show that 3x* + 10x? — 40x? + 25x — 20 has no roots in Z. 


U 
Use exercise 69 to show that 2x? — 6x° + 90x* — 21x? + 36x? — 57 has no roots in Z. 


74. (a) Show that the criterion in exercise 69 does not apply to either x^2 - 5x + 6 or
x^2 - 5x + 7.
(b) Show that x^2 - 5x + 6 has roots in Z and x^2 - 5x + 7 has no roots in Z.
(c) Conclude that if exercise 69 does not apply, then you need more information to
determine if a polynomial has roots in Z.


7.3 The Euler φ Function


In Chapter 3, we remarked that prime numbers are the building blocks of the integers and
various problems can be solved by understanding prime numbers. One of the problems that
can be solved by exploiting properties of prime numbers is our question regarding the number
of invertible elements in Z_n. In light of Theorem 7.8, this problem requires us to determine
how many of the integers from 1 to n - 1 are relatively prime to n. To solve this, it will be
useful to introduce some new terminology.


Definition 7.10. If n > 1 is an integer, we define φ(n) to be the number of positive integers
less than n that are relatively prime to n. The function φ is usually referred to as the Euler φ
function.


The first step in finding a formula for φ is to observe that if p is a prime number, then
φ(p) = p - 1. The reason for this is that all of the p - 1 positive integers less than p are
relatively prime to p. The next step in finding a general formula for φ will be to consider
φ(p^m), where p is prime and m ∈ N.


Lemma 7.11. If p is a prime number and m ∈ N, then φ(p^m) = p^{m-1}(p - 1).


Intuition. Suppose we want to compute φ(125) = φ(5^3). The only way an integer can fail to
be relatively prime to 125 is if it is a multiple of 5. The only positive integers less than 125 that
are multiples of 5 are


5 = 1·5, 10 = 2·5, 15 = 3·5, 20 = 4·5, ..., 115 = 23·5, 120 = 24·5.


Therefore, of the 124 = 5^3 - 1 positive integers less than 125, we see that 24 = 5^2 - 1 of them
fail to be relatively prime to 5^3. That means that the number of those that are relatively prime to
125 is


124 - 24 = (5^3 - 1) - (5^2 - 1) = 5^3 - 5^2 = 5^2(5 - 1).

The proof for the general case is virtually identical.

Proof. There are p^m - 1 positive integers less than p^m. The only ones that are not relatively
prime to p^m are those that are multiples of p. If we list the positive integers that are less than
p^m and divisible by p, we have


1·p, 2·p, 3·p, 4·p, ..., (p^{m-1} - 2)·p, (p^{m-1} - 1)·p.


Therefore, p^{m-1} - 1 of the positive integers less than p^m are divisible by p, meaning that the
remaining


(p^m - 1) - (p^{m-1} - 1) = p^m - p^{m-1} = p^{m-1}(p - 1)


are relatively prime to p^m. Thus, φ(p^m) = p^{m-1}(p - 1), as desired.
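The counting argument in this proof can be checked numerically. The following is a minimal sketch, not part of the text: the helper name `phi_by_counting` is an assumption made for illustration, and it simply counts the integers described in Definition 7.10.

```python
from math import gcd

def phi_by_counting(n):
    """Count the positive integers less than n that are relatively prime to n."""
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)

# Compare the direct count with the formula p^(m-1) * (p - 1) of Lemma 7.11.
for p, m in [(2, 8), (3, 4), (5, 3), (7, 2)]:
    n = p ** m
    assert phi_by_counting(n) == p ** (m - 1) * (p - 1)
    print(n, phi_by_counting(n))
```

The direct count is slow for large n; it is only meant to confirm the formula on small prime powers.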


Examples


φ(256) = φ(2^8) = 2^7(2 - 1) = 2^7 = 128. Thus, the group U(Z_256) has 128 elements.
φ(3125) = φ(5^5) = 5^4(5 - 1) = 625·4 = 2500. Therefore, U(Z_3125) is a group with 2500
elements.


Having found a formula for the special case where n = p^m, we need to handle the case where
n cannot be written as p^m. One of the tools we will use to accomplish this is


Lemma 7.12. If a, s, t ∈ N, then a is relatively prime to s·t if and only if it is relatively prime
to both s and t.


Proof. In this proof, we will make repeated use of the fact that integers n, m are relatively
prime if and only if there exist integers A, B such that A·n + B·m = 1. In one direction,
suppose a is relatively prime to both s and t. Then there exist integers A, B, C, D such that


(4) A·a + B·s = 1 and C·a + D·t = 1.


Multiplying the second equation in (4) by B gives us BC·a + BD·t = B. This equation
enables us to substitute for B in the first equation in (4) to obtain


A·a + (BC·a + BD·t)·s = 1.

We can rewrite this equation as

(A + BCs)·a + (BD)·(s·t) = 1.


Having succeeded in writing 1 as a multiple of a plus a multiple of s·t, we see that a is
relatively prime to s·t.


In the other direction, suppose a and s·t are relatively prime. Therefore, we know there exist
integers A, B such that


A·a + B·(s·t) = 1.

We can rewrite this equation as

A·a + Bt·s = 1 = A·a + Bs·t.


Having written 1 as a multiple of a plus a multiple of s, we see that a is relatively prime to s.
The identical reasoning tells us that a is also relatively prime to t.
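The first half of this proof is constructive, and the construction can be tried on actual numbers. The sketch below is an illustration only: the function `bezout` (an extended Euclidean algorithm) and the sample values a = 7, s = 10, t = 9 are assumptions, not taken from the text.

```python
def bezout(x, y):
    """Return (g, u, v) with u*x + v*y == g == gcd(x, y) (extended Euclidean algorithm)."""
    old_r, r = x, y
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

a, s, t = 7, 10, 9
g1, A, B = bezout(a, s)   # A*a + B*s == 1 because gcd(7, 10) = 1
g2, C, D = bezout(a, t)   # C*a + D*t == 1 because gcd(7, 9) = 1
assert g1 == 1 and g2 == 1

# The combination used in the proof writes 1 as a multiple of a plus a multiple of s*t.
assert (A + B * C * s) * a + (B * D) * (s * t) == 1
print(A + B * C * s, B * D)
```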


Suppose we can factor n as n = s·t, and suppose we already know the values of φ(s) and φ(t).
At first glance, it is not clear if this helps us in computing φ(n). If we look back at some of the
earlier examples, we have


φ(6) = 2 = 1·2 = φ(2)·φ(3) and φ(10) = 4 = 1·4 = φ(2)·φ(5),

yet we also have


φ(4) = 2 ≠ 1·1 = φ(2)·φ(2) and φ(25) = 20 ≠ 4·4 = φ(5)·φ(5).


Table 7.13: If n = s·t, where s, t > 1, then we can list all the positive integers from 1 to n in a table
consisting of s rows and t columns. As we move from left to right in any row, the numbers increase by 1.
Then as we move down any of our columns, the numbers increase by t. Therefore, if a is the top number
in a given column, then the s entries in the column would be a, t+a, 2t+a, ..., (s-2)t+a, (s-1)t+a.


1            ...   a            ...   t
t+1          ...   t+a          ...   2t
 ⋮                  ⋮                  ⋮
(s-2)t+1     ...   (s-2)t+a     ...   (s-1)t
(s-1)t+1     ...   (s-1)t+a     ...   st


Therefore, it is sometimes but not always the case that φ(n) = φ(s)·φ(t). However, it turns out
that when s and t are relatively prime, φ(n) is indeed equal to φ(s)·φ(t). Proving this fact will
require visualizing n as the product s times t, which we do in the table above.
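The contrast between the two kinds of examples can be seen by brute force. This is only a sketch: `phi` here is a hypothetical helper that counts directly from Definition 7.10, and the pairs (s, t) beyond those in the text are chosen for illustration.

```python
from math import gcd

def phi(n):
    """Number of positive integers less than n that are relatively prime to n."""
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)

# Each row shows s, t, gcd(s, t), phi(s*t), phi(s)*phi(t), and whether the two agree.
for s, t in [(2, 3), (2, 5), (2, 2), (5, 5), (4, 9), (6, 10)]:
    agrees = phi(s * t) == phi(s) * phi(t)
    print(s, t, gcd(s, t), phi(s * t), phi(s) * phi(t), agrees)
```

In this small sample, the two values agree exactly for the pairs with gcd(s, t) = 1.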


Observe that computing φ(n) now reduces to counting the number of entries in our table that
are relatively prime to n. In light of Lemma 7.12, we need to determine how many of the
entries are relatively prime to both s and t. To do so, we need to analyze each column in our
table.


Lemma 7.14. If a is relatively prime to t, then every entry in the column consisting of
a, t+a, 2t+a, ..., (s-2)t+a, (s-1)t+a is relatively prime to t. On the other hand, if a and
t are not relatively prime, then none of the entries of this column are relatively prime to t.


Proof. Suppose A, B ∈ Z such that

A·a + B·t = c.

Then, if i ∈ Z, we can rewrite the previous equation as

A·(i·t + a) + (B - A·i)·t = c.


Therefore, any number that can be written as a multiple of a plus a multiple of t can also be
written as a multiple of i·t + a plus a multiple of t.


In the other direction, if C, D, i ∈ Z such that


C·(i·t + a) + D·t = d,


then we can rewrite this as

C·a + (D + C·i)·t = d.


Therefore, any number that can be written as a multiple of i·t + a plus a multiple of t can also
be written as a multiple of a plus a multiple of t.


These observations combine to tell us that regardless of our choice of i ∈ Z, the smallest
positive integer that can be written as a multiple of i·t + a plus a multiple of t remains the
same. Theorem 3.8 now asserts that for every i ∈ Z, the greatest common divisor of i·t + a
and t is the same. As a result, if a and t are relatively prime, then t is relatively prime to every
number in the column in Table 7.13 with a on top. Similarly, if a and t are not relatively prime,
then t is not relatively prime to any of the numbers in the column with a on top.


As we move along the first row of the table, only φ(t) of the entries are relatively prime to t.
Therefore, Lemma 7.14 tells us that in our search for entries that are relatively prime to
both s and t, we can limit our search to the φ(t) columns where the top entry is relatively prime
to t. Next, we analyze which entries in these φ(t) columns are relatively prime to s. Recall that
if s and t are not relatively prime, then it need not be the case that φ(s·t) = φ(s)·φ(t). In the
next lemma, you will see where the assumption that s and t are relatively prime comes into
play.


Lemma 7.15. If s and t are relatively prime, then the equivalence classes


[a]_s, [t+a]_s, [2t+a]_s, ..., [(s-2)t+a]_s, [(s-1)t+a]_s


are the s different classes belonging to Z_s.


Proof. It suffices to show that the s equivalence classes [a]_s, [t+a]_s, [2t+a]_s, ...,
[(s-2)t+a]_s, [(s-1)t+a]_s are all different elements of Z_s. To this end, let i, j be integers
such that 0 ≤ i, j ≤ s - 1. If


[i·t + a]_s = [j·t + a]_s,


then s divides (i·t + a) - (j·t + a) = t·(i - j). Since s | (t·(i - j)) and gcd(s, t) = 1, we can
apply Lemma 3.9 to assert that s divides i - j. However, since 0 ≤ i, j ≤ s - 1, the only
way s can divide i - j is for i and j to be equal. This tells us that as i ranges through the
numbers from 0 to s - 1, the equivalence classes of the form [i·t + a]_s are all different. But
this is precisely what we needed to prove.


Examples


Let us examine what Lemma 7.15 says if we let a = 3, s = 7, and t = 10. It asserts that the
classes


[3]_7, [13]_7, [23]_7, [33]_7, [43]_7, [53]_7, [63]_7

are the seven elements of the set Z_7. To see this, observe that

[63]_7 = [0]_7, [43]_7 = [1]_7, [23]_7 = [2]_7, [3]_7 = [3]_7,
[53]_7 = [4]_7, [33]_7 = [5]_7, [13]_7 = [6]_7.


Note that whereas it is important that s be relatively prime to t, a can have factors in
common with either s or t. In particular, if we instead let a = 5 in the preceding example,
we see that


[5]_7, [15]_7, [25]_7, [35]_7, [45]_7, [55]_7, [65]_7

are also the seven elements of the set Z_7. In this case,


[35]_7 = [0]_7, [15]_7 = [1]_7, [65]_7 = [2]_7, [45]_7 = [3]_7,
[25]_7 = [4]_7, [5]_7 = [5]_7, [55]_7 = [6]_7.
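The two examples above can be reproduced with a few lines of code. This is a small sketch; the function name `column_residues` and the third call, with s and t not relatively prime, are assumptions added for contrast.

```python
def column_residues(a, s, t):
    """Residues modulo s of the column a, t + a, 2t + a, ..., (s - 1)t + a."""
    return [(i * t + a) % s for i in range(s)]

print(column_residues(3, 7, 10))  # [3, 6, 2, 5, 1, 4, 0]
print(column_residues(5, 7, 10))  # [5, 1, 4, 0, 3, 6, 2]

# When gcd(s, t) > 1 the residues repeat, so the column misses some classes of Z_s.
print(column_residues(1, 4, 6))   # [1, 3, 1, 3]
```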


We can now combine Lemmas 7.14 and 7.15 to obtain

Lemma 7.16. If n = s·t, where s, t > 1 and gcd(s, t) = 1, then φ(n) = φ(s)·φ(t).


Proof. Recall that we need to determine the number of entries in Table 7.13 that are relatively
prime to both s and t. Lemma 7.14 told us that only φ(t) of our t columns contain numbers
relatively prime to t and every number in these φ(t) columns is relatively prime to t. On the
other hand, Lemma 7.15 tells us that regardless of which column we look at, the s entries in
the column correspond to the s different equivalence classes in Z_s. However, in Z_s there are
exactly φ(s) invertible elements, and Theorem 7.8 tells us that an element [a]_s ∈ Z_s is
invertible precisely if a and s are relatively prime. Therefore, every column contains exactly
φ(s) entries relatively prime to s. As a result, in each of the φ(t) columns consisting entirely of
entries relatively prime to t, there are φ(s) entries that are also relatively prime to s. Thus, the
number of entries in our table relatively prime to both s and t must be equal to φ(s)·φ(t).
Hence, φ(n) = φ(s)·φ(t).


In Lemma 7.11, we determined the formula for φ(n) in the special case where n = p^m.
Combining that result with Lemma 7.16, we can now determine the general formula for φ(n).


Theorem 7.17. If n = p_1^{a_1} · p_2^{a_2} ··· p_m^{a_m} is the prime factorization of n, where the primes p_i
are all different and each a_i ≥ 1, then


φ(n) = p_1^{a_1-1}(p_1 - 1) · p_2^{a_2-1}(p_2 - 1) ··· p_m^{a_m-1}(p_m - 1).


Proof. Let T = {k ∈ N | the formula for φ holds for all n ∈ N with k different prime divisors};
it suffices to show that T = N, and we will proceed by Mathematical Induction. To show that
1 ∈ T, we need to verify that the formula for φ holds for all n ∈ N with only one prime divisor.
But this means that n is of the form p_1^{a_1}. However, in this special case, Lemma 7.11 tells us
that the formula for φ is correct. Thus, 1 ∈ T.


Next, suppose T contains some positive integer k; we need to show that T also contains k + 1.
Thus, we must show that the formula for φ holds for all positive integers with exactly k + 1
different prime divisors. To this end, suppose


n = p_1^{a_1} · p_2^{a_2} ··· p_k^{a_k} · p_{k+1}^{a_{k+1}},


where all the primes p_i are different and each a_i ≥ 1. If we let


s = p_1^{a_1} · p_2^{a_2} ··· p_k^{a_k} and t = p_{k+1}^{a_{k+1}},


then s and t are relatively prime as p_{k+1} is the only prime divisor of t, but p_{k+1} is not a divisor
of s. Therefore, we can apply Lemma 7.16 to obtain the fact that φ(n) = φ(s·t) = φ(s)·φ(t).
However, since both 1 and k belong to T, we know that the formula for φ holds for s and t.
Thus,


φ(s) = p_1^{a_1-1}(p_1 - 1) · p_2^{a_2-1}(p_2 - 1) ··· p_k^{a_k-1}(p_k - 1)


and


φ(t) = p_{k+1}^{a_{k+1}-1}(p_{k+1} - 1).


As a result, we now have


φ(n) = φ(s)·φ(t)
     = (p_1^{a_1-1}(p_1 - 1) · p_2^{a_2-1}(p_2 - 1) ··· p_k^{a_k-1}(p_k - 1)) · p_{k+1}^{a_{k+1}-1}(p_{k+1} - 1)
     = p_1^{a_1-1}(p_1 - 1) · p_2^{a_2-1}(p_2 - 1) ··· p_k^{a_k-1}(p_k - 1) · p_{k+1}^{a_{k+1}-1}(p_{k+1} - 1).


Thus, k + 1 ∈ T, thereby proving the result.


Examples

φ(100) = φ(2^2·5^2) = φ(2^2)·φ(5^2) = 2^1(2 - 1)·5^1(5 - 1) = 40,
φ(360) = φ(2^3·3^2·5^1) = φ(2^3)·φ(3^2)·φ(5^1) = 2^2(2 - 1)·3^1(3 - 1)·5^0(5 - 1) = 96,
φ(847) = φ(7^1·11^2) = φ(7^1)·φ(11^2) = 7^0(7 - 1)·11^1(11 - 1) = 660.
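The formula of Theorem 7.17 translates directly into a computation. The following is a sketch under simple assumptions: `prime_factorization` uses plain trial division, which is adequate only for small n, and both function names are chosen here for illustration.

```python
def prime_factorization(n):
    """Return {prime: exponent} for an integer n >= 2, using trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def phi(n):
    """Euler phi function computed from the prime factorization, as in Theorem 7.17."""
    result = 1
    for p, a in prime_factorization(n).items():
        result *= p ** (a - 1) * (p - 1)
    return result

print(phi(100), phi(360), phi(847))  # 40 96 660
```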


Theorem 7.5 told us that if [a]_n ∈ U(Z_n), then there exists a positive integer m such that
[a]_n^m = [1]_n. We can now show that the exponent φ(n) works for every [a]_n ∈ U(Z_n).


Proposition 7.18. In the group U(Z_n), [a]_n^{φ(n)} = [1]_n, for every [a]_n ∈ U(Z_n).


Proof. We begin with a very common argument, which we already used in the proof of
Theorem 6.8. If g, h, k are elements of a group G such that h ≠ k, then we claim that gh ≠ gk.
To see this, observe that if gh = gk, then g^{-1}·gh = g^{-1}·gk, which would lead to the
contradiction h = k.


The preceding fact can be applied to U(Z_n) to assert that if [a]_n ∈ U(Z_n) and if
[b_1]_n, [b_2]_n, ..., [b_{φ(n)}]_n are the φ(n) different elements of U(Z_n), then the elements


[a]_n·[b_1]_n, [a]_n·[b_2]_n, ..., [a]_n·[b_{φ(n)}]_n


are also the φ(n) different elements of U(Z_n). Since multiplication in U(Z_n) is commutative,
the product of the φ(n) elements of U(Z_n) gives us the same answer regardless of the order of
the elements. In light of this, we now have


[b_1]_n·[b_2]_n ··· [b_{φ(n)}]_n = ([a]_n·[b_1]_n)·([a]_n·[b_2]_n) ··· ([a]_n·[b_{φ(n)}]_n).


If we let [c]_n = [b_1]_n·[b_2]_n ··· [b_{φ(n)}]_n, then by applying the associativity and commutativity
of multiplication in Z_n, we have


[c]_n = [b_1]_n·[b_2]_n ··· [b_{φ(n)}]_n = ([a]_n·[b_1]_n)·([a]_n·[b_2]_n) ··· ([a]_n·[b_{φ(n)}]_n)
      = [a]_n^{φ(n)}·([b_1]_n·[b_2]_n ··· [b_{φ(n)}]_n) = [a]_n^{φ(n)}·[c]_n.


If we multiply the preceding equation by [c]_n^{-1}, we immediately see that [a]_n^{φ(n)} = [1]_n.


Examples


The group U(Z_12) has φ(12) = 4 elements, and they are [1]_12, [5]_12, [7]_12, [11]_12.
Proposition 7.18 now tells us that


[1]_12^4 = [1^4]_12 = [1]_12, [5]_12^4 = [5^4]_12 = [625]_12 = [1]_12,
[7]_12^4 = [7^4]_12 = [2401]_12 = [1]_12, [11]_12^4 = [11^4]_12 = [14641]_12 = [1]_12.


Similarly, the group U(Z_9) has φ(9) = 6 elements, and they are [1]_9, [2]_9, [4]_9, [5]_9, [7]_9,
[8]_9. Proposition 7.18 now tells us that


[1]_9^6 = [1^6]_9 = [1]_9, [2]_9^6 = [2^6]_9 = [64]_9 = [1]_9,
[4]_9^6 = [4^6]_9 = [4096]_9 = [1]_9, [5]_9^6 = [5^6]_9 = [15625]_9 = [1]_9,
[7]_9^6 = [7^6]_9 = [117649]_9 = [1]_9, [8]_9^6 = [8^6]_9 = [262144]_9 = [1]_9.
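Computations like these are easy to automate. The sketch below is illustrative only: the helper name `units` is an assumption, and it relies on Python's built-in three-argument `pow` for modular exponentiation. It lists the representatives of U(Z_n) via Theorem 7.8 and checks Proposition 7.18 for a few values of n.

```python
from math import gcd

def units(n):
    """Representatives a with 1 <= a < n of the invertible classes [a]_n (Theorem 7.8)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

for n in (9, 12, 20):
    u = units(n)
    phi_n = len(u)  # phi(n) is the number of elements of U(Z_n)
    # pow(a, phi_n, n) computes a^phi(n) reduced modulo n
    assert all(pow(a, phi_n, n) == 1 for a in u)
    print(n, phi_n, u)
```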


7.4 Polynomials with Coefficients in Z_n


In Chapter 5, we remarked that when R is a commutative ring, then so is R[x], the set of
polynomials with coefficients in R. As a result, by looking at Z_n[x], for various values of n, we
obtain many additional examples of commutative rings. Before continuing, we need to discuss
some of the notation and terminology associated with Z_n[x] in order to make sure that we fully
understand what elements of Z_n[x] look like. For example, some typical elements of Z_5[x] are


[3]sx7+([2]sx+[4]s, [4]5x°+[3]sx7,  [1]sx!4+ [2]5x7 + [4]5x° + [2]s. 


It is extremely important to understand that although the coefficients of the polynomials in
Z_5[x] are elements of Z_5, the exponents are ordinary integers. Therefore,


[8]_5 x^7 = [3]_5 x^7,

since [8]_5 = [3]_5. On the other hand,

[4]_5 x^8 ≠ [4]_5 x^3,


since x^8 ≠ x^3. Ordinarily, if a term in a polynomial is not the constant term and its coefficient
is 1, then we need not write the 1. For example, in Q[x], we usually write


3x^2 + 1x + 7 as 3x^2 + x + 7.


Similarly, in Z_5[x], we can write

[3]_5 x^2 + [1]_5 x + [2]_5 as [3]_5 x^2 + x + [2]_5.


Furthermore, if the coefficient of a term in a polynomial is 0, then we usually skip writing the
entire term. For example, in R[x], we can write


6x^3 - 17x^2 + 0x + 4 as 6x^3 - 17x^2 + 4.

Similarly, in Z_5[x], we can write

[2]_5 x^3 + [0]_5 x^2 + [3]_5 x + [0]_5 as [2]_5 x^3 + [3]_5 x.

If every coefficient of a polynomial is the additive identity, then we simply write the
polynomial as 0.

Given an element a ∈ Z_n and a polynomial p(x) ∈ Z_n[x], we can easily check if a is a root of
p(x). For example, if we consider p(x) = x^2 + [2]_5 x + [1]_5 ∈ Z_5[x], then [4]_5 is a root because

p([4]_5) = [4]_5^2 + [2]_5·[4]_5 + [1]_5 = [16]_5 + [8]_5 + [1]_5 = [1]_5 + [3]_5 + [1]_5 = [5]_5 = [0]_5.


However, [3]_5 is not a root as


p([3]_5) = [3]_5^2 + [2]_5·[3]_5 + [1]_5 = [9]_5 + [6]_5 + [1]_5 = [4]_5 + [1]_5 + [1]_5 =
[6]_5 = [1]_5 ≠ [0]_5.


As we consider examples where Z_n is not a field, some new and surprising things occur. In
Z_8[x], consider the polynomial g(x) = x^2 + [7]_8. You can check that [1]_8, [3]_8, [5]_8, [7]_8 are all
roots of g(x). Thus, g(x) is a polynomial of degree 2 with four different roots. This is a
phenomenon that does not occur when we look at polynomials with coefficients in Z, Q, R, or
C. In fact, as we will see in Chapter 12, the number of roots that a polynomial can have in a
field cannot exceed the degree of the polynomial. But as this example shows, if the coefficients
belong to a commutative ring that has zero divisors, the situation can be very different.
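Because Z_n is finite, roots can be found by trying every element. The following sketch assumes a hypothetical helper `roots_mod` that takes integer representatives of the coefficients, listed from the constant term upward.

```python
def roots_mod(coeffs, n):
    """Representatives x in 0..n-1 with f([x]_n) = [0]_n.

    coeffs lists integer representatives of the coefficients, constant term first.
    """
    def value(x):
        return sum(c * x ** k for k, c in enumerate(coeffs)) % n
    return [x for x in range(n) if value(x) == 0]

print(roots_mod([7, 0, 1], 8))  # x^2 + [7]_8 in Z_8[x]: [1, 3, 5, 7]
print(roots_mod([1, 2, 1], 5))  # x^2 + [2]_5 x + [1]_5 in Z_5[x]: [4]
```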


Next, in Z_8[x], consider the product of the polynomials h(x) = [2]_8 x^2 + [6]_8 x and
j(x) = [4]_8 x. We now have


h(x)·j(x) = ([2]_8 x^2 + [6]_8 x)([4]_8 x) = [8]_8 x^3 + [24]_8 x^2 = [0]_8 x^3 + [0]_8 x^2 = 0.


In this example, we multiplied a polynomial of degree 2 with a polynomial of degree 1 and
received 0 as the answer. In the past, when we multiplied polynomials, we added their degrees.
This will certainly still be the case if our coefficients belong to a field. But, once again, if we
are dealing with commutative rings with zero divisors, interesting new things can occur.
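The product above can be recomputed mechanically. In this sketch, polynomials are stored as lists of integer representatives with the constant term first; the function name `poly_mul_mod` and the comparison with the prime modulus 7 are assumptions added for illustration.

```python
def poly_mul_mod(f, g, n):
    """Multiply two polynomials (coefficient lists, constant term first) in Z_n[x]."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] = (product[i + j] + a * b) % n
    return product

h = [0, 6, 2]  # [2]x^2 + [6]x
j = [0, 4]     # [4]x

print(poly_mul_mod(h, j, 8))  # [0, 0, 0, 0], the zero polynomial in Z_8[x]
print(poly_mul_mod(h, j, 7))  # [0, 0, 3, 1], degree 3 when the same representatives are read in Z_7[x]
```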


Since Z_p is a field, for every prime number p, many of the algebraic ideas and techniques used
to find roots of polynomials or solve equations in Z_p[x] will be familiar to you. For example,
consider the equation 3x + 4 = 3 in Q[x]. To solve this, we first subtract 4 from both sides to
obtain 3x = -1 and then divide by 3 to conclude that x = -1/3. If we restate the steps taken in
terms of addition and multiplication, we can say that we first added -4 to both sides and then
multiplied by 1/3.


Now let us examine a similar situation in Z_5[x] and consider the equation

[3]_5 x + [4]_5 = [3]_5.

Similar to the preceding situation, we first add [-4]_5 to both sides to obtain

[3]_5 x = [3]_5 + [-4]_5 = [-1]_5 = [4]_5.


In Z_5, the multiplicative inverse of [3]_5 is [2]_5. Therefore, in this case we multiply the
previous equation by [2]_5 to obtain


x = [2]_5·[4]_5 = [8]_5 = [3]_5.


As always, it is a good idea to check one's answer, and if we plug x = [3]_5 into our equation,
we have


[3]_5·[3]_5 + [4]_5 = [9]_5 + [4]_5 = [13]_5 = [3]_5,


as desired.
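The same two steps, adding the additive inverse and then multiplying by the multiplicative inverse, can be written as a short routine. This is a sketch assuming Python 3.8 or later, where the built-in `pow(a, -1, p)` returns a modular inverse; the function name `solve_linear` is hypothetical.

```python
def solve_linear(a, b, c, p):
    """Solve [a]x + [b] = [c] in Z_p, assuming p is prime and p does not divide a."""
    a_inverse = pow(a, -1, p)          # multiplicative inverse of [a]_p
    return (a_inverse * (c - b)) % p   # add [-b]_p to both sides, then multiply by the inverse

x = solve_linear(3, 4, 3, 5)
print(x)                      # 3
assert (3 * x + 4) % 5 == 3   # check the answer in the original equation
```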


When looking for the roots of a polynomial, the first thing we often do is to try to factor the
polynomial. For example, if we are looking for the roots in R of x^2 - 7x + 10, we can factor
it as


(5) x^2 - 7x + 10 = (x - 2)(x - 5).


Replacing x by either 2 or 5 in (5) makes one of the factors of the right-hand side equal to 0.
Since multiplying by 0 in a commutative ring always yields a product of 0, we can see that
both 2 and 5 are roots of x^2 - 7x + 10. The same reasoning tells us that whenever we factor a
polynomial into polynomials of smaller degree, any root of one of the factors must be a root of
the original polynomial.


On the other hand, if in (5) we replace x with a real number other than 2 or 5, then both x - 2
and x - 5 will be nonzero real numbers. Since R has no zero divisors, the value of
x^2 - 7x + 10 = (x - 2)(x - 5) will also be a nonzero real number. Thus, the fact that R has no
zero divisors is the reason that 2 and 5 are the only roots of x^2 - 7x + 10 in R. Since Z, Q, R,
and C all contain no zero divisors, the same reasoning tells us that whenever we can factor a
polynomial in Z[x], Q[x], R[x], or C[x] into linear factors, we can then easily read off all of
its roots.


Next, suppose we examine the polynomial x^2 + [1]_5 in Z_5[x]. Observe that in Z_5[x], this
polynomial does indeed factor, and we have


x^2 + [1]_5 = (x + [2]_5)(x + [3]_5).


Since Z_5 also has no zero divisors, in order for an element of Z_5 to be a root of x^2 + [1]_5, it
must make either x + [2]_5 or x + [3]_5 equal to zero. Thus, the only roots in Z_5 of x^2 + [1]_5 are
[3]_5 and [2]_5.


Earlier we saw that [1]_8, [3]_8, [5]_8, and [7]_8 are all roots of the polynomial x^2 + [7]_8 in Z_8.
Note that if we factor x^2 + [7]_8 as (x + [1]_8)(x + [7]_8) in Z_8[x], then it is clear that [7]_8 and
[1]_8 are roots. However, these are not the only roots in Z_8. Observe that there are elements in
Z_8 that fail to make either (x + [1]_8) or (x + [7]_8) equal to zero yet are still roots of x^2 + [7]_8.
In particular, if we let x = [3]_8, then (x + [1]_8)(x + [7]_8) becomes


([3]_8 + [1]_8)([3]_8 + [7]_8) = ([4]_8)([10]_8) = [4]_8·[2]_8 = [8]_8 = [0]_8.

Similarly, if we let x = [5]_8, then (x + [1]_8)(x + [7]_8) becomes

([5]_8 + [1]_8)([5]_8 + [7]_8) = ([6]_8)([12]_8) = [6]_8·[4]_8 = [24]_8 = [0]_8.


Another surprising aspect of this example is that not only can we factor x^2 + [7]_8 as
(x + [1]_8)(x + [7]_8) in Z_8[x], but we can also factor it as (x + [3]_8)(x + [5]_8). Thus, in the
presence of zero divisors, there may be more than one way to factor a polynomial into linear
factors. Hopefully, this example helps to illustrate why the number of roots of a polynomial
can exceed its degree when working in a commutative ring with zero divisors.
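Both factorizations can be found by a small search. Since (x + [r])(x + [s]) = x^2 + [r + s]x + [rs], the sketch below looks for all pairs with r + s and rs matching the coefficients; the function name `monic_linear_factorizations` is an assumption made for illustration.

```python
def monic_linear_factorizations(b, c, n):
    """Pairs (r, s) with r <= s such that (x + [r])(x + [s]) = x^2 + [b]x + [c] in Z_n[x]."""
    return [(r, s)
            for r in range(n)
            for s in range(r, n)
            if (r + s) % n == b % n and (r * s) % n == c % n]

print(monic_linear_factorizations(0, 7, 8))  # x^2 + [7]_8 in Z_8[x]: [(1, 7), (3, 5)]
print(monic_linear_factorizations(0, 1, 5))  # x^2 + [1]_5 in Z_5[x]: [(2, 3)]
```

In the field Z_5 the search returns a single pair, whereas in Z_8 it returns two.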


In your earlier algebra courses, when you were asked to find all the roots of a polynomial, the
first thing you probably did was to try to factor the polynomial into linear factors. When doing
this, you weren't concerned with concepts and terms like commutative rings and zero divisors.
But we can now see that using the linear factors to determine all the roots of a polynomial
required that the coefficients of our polynomial belonged to a commutative ring without zero
divisors.


In this chapter we observed that since Z_8 has zero divisors, working with polynomials in Z_8[x]
was very different from our previous experiences with Z[x], Q[x], R[x], and C[x]. However,
Z_p has no zero divisors whenever p is prime. Therefore, working with polynomials in Z_p[x]
will be quite similar to our experiences with Z[x], Q[x], R[x], and C[x]. In fact, in the next
chapter, we will exploit properties of Z_p[x] to better understand the roots and factoring of
polynomials in Z[x] and Q[x].


Exercises for Sections 7.3 and 7.4


1. Compute φ(3), φ(9), φ(27), φ(81), and φ(243).

2. Compute φ(7), φ(49), φ(343), φ(2401), and φ(16807).

3. Compute φ(6), φ(18), φ(54), φ(162), and φ(486).

4. Compute φ(14), φ(98), φ(686), φ(4802), and φ(33614).

5. Compute φ(4900).

6. Compute φ(64,000,000).

7. Compute φ(7!). Recall that if n ∈ N, then n! = 1·2···(n-1)·n.

8. Compute φ(8!).

Compute #(2° -34-5°-77-11'), 

Compute (5° -74- 137-174). 

11. Prove that if n > 2 is an integer, then φ(n) is even.

12. Prove that if n > 3 is an odd integer, then φ(2n) = φ(n).

13. Prove that if n > 2 is an even integer, then φ(2n) = 2φ(n).

14. Find all integers n > 2 such that φ(n) = 2.

15. Find all integers n > 2 such that φ(n) = 4.

16. Find all positive integers n such that φ(n) is a prime number.

17. Let n > 2 be an integer and let p be a prime number.
(a) Prove that if p does not divide n, then φ(pn) = (p - 1)φ(n).
(b) Prove that if p divides n, then φ(pn) = pφ(n).

18. If n ∈ N, then w ∈ C is called a primitive nth root of 1 if every other nth root of 1
in C can be written as w raised to some positive integer power. For example, -1 is
the only primitive square root of 1 in C, whereas i and -i are the only primitive 4th
roots of 1 in C. Show that if n > 2, then there are exactly φ(n) primitive nth roots
of 1 in C.

19. Compute the values of p(x) = x^2 + [3]_4 x + [2]_4 in Z_4 as x takes on the values
[0]_4, [1]_4, [2]_4, [3]_4.

20. Compute the values of q(x) = x^2 + [5]_6 x + [2]_6 in Z_6 as x takes on the values
[0]_6, [1]_6, [2]_6, [3]_6, [4]_6, [5]_6.




In exercises 21-24, find the solutions in Z_6 of the given equations. Observe that since Z_6 has
only six elements, trial-and-error is often a reasonable way to proceed.


21. [2]_6 x + [5]_6 = [3]_6

22. [2]_6 x + [5]_6 = [4]_6

23. 2° += [Ole 

24. x°+x4+[5]o = [0]o 

In exercises 25-28, you will be doing computation in the ring Z7[x]. 
25. ([4]7x+[2]7)- ([37x* + [57x + [6]7) = 

26. ((5]7x? + [2]7x + [1]7)- (16]7x7 + [4]7x) = 

27. («+[5]7)? = 

28. ((2)7x +[3]7) -@+[6]7)- (57x) = 

In exercises 29-32, you will be doing computation in the ring Z_10[x].
29. ([3}iox+[8]io) - (70x? + [S}i0x + [4]10) = 

30. ([5}iox* + [8]iox+ [3]10) - (4}10x7 + [8]10x) = 

31. (x+[4]10)? = 

32. ([6]10x + [7]10) - (S)i0x + [2]i0) - (7 )i0x + [4]io) = 

In exercises 33-36, find a solution to the linear equation in the field Z_11.
33. [2]_11 x + [9]_11 = [6]_11

34. [7]_11 x + [10]_11 = [4]_11

35. [6]_11([3]_11 x + [2]_11) = [0]_11

36. [8]_11([3]_11 x + [5]_11) = [3]_11([9]_11 x + [6]_11)


In exercises 37-40, determine if the quadratic polynomial can be written as a product of two
linear polynomials in Z_5[x].


37. x^2 + [2]_5 x + [2]_5
38. [4]_5 x^2 + [2]_5 x + [4]_5
39. x^2 + [3]_5 x + [3]_5
40. [3]_5 x^2 + [3]_5 x + [4]_5




In exercises 41-44, determine if the quadratic polynomial can be written as a product of two
linear polynomials in Z_7[x].


41. x^2 + [5]_7

42. [3]_7 x^2 + [5]_7 x + [6]_7

43. x^2 + [4]_7 x + [5]_7

44. [2]_7 x^2 + [5]_7 x + [6]_7

45. Let π : Z_2[x] → Z_2[x] be defined as π(f(x)) = f(x)^2, for all f(x) ∈ Z_2[x].
(a) Show that π(f(x) + g(x)) = π(f(x)) + π(g(x)), for all f(x), g(x) ∈ Z_2[x].
(b) Show that π(f(x)·g(x)) = π(f(x))·π(g(x)), for all f(x), g(x) ∈ Z_2[x].
(c) Show that a^2 = a, for all a ∈ Z_2.
(d) Use parts (a), (b), and (c) to show that π(h(x)) = h(x^2), for all h(x) ∈ Z_2[x].

46. Let ρ : Z_3[x] → Z_3[x] be defined as ρ(f(x)) = f(x)^3, for all f(x) ∈ Z_3[x].
(a) Show that ρ(f(x) + g(x)) = ρ(f(x)) + ρ(g(x)), for all f(x), g(x) ∈ Z_3[x].
(b) Show that ρ(f(x)·g(x)) = ρ(f(x))·ρ(g(x)), for all f(x), g(x) ∈ Z_3[x].
(c) Show that a^3 = a, for all a ∈ Z_3.
(d) Use parts (a), (b), and (c) to show that ρ(h(x)) = h(x^3), for all h(x) ∈ Z_3[x].

47. List all the monic polynomials of degree 2 in Z_2[x].

48. In Z_2, find the roots of each of the polynomials in your answer to the previous exercise.

49. List all the monic polynomials of degree 2 in Z_3[x].

50. In Z_3, find the roots of each of the polynomials in your answer to the previous exercise.

51. List all the monic polynomials of degree 3 in Z_2[x].

52. In Z_2, find the roots of each of the polynomials in your answer to the previous
exercise.

53. If p is a prime number and a ∈ Z is relatively prime to p, show that [a]_p^{p−1} = [1]_p.

54. Use exercise 53 to show that if p is a prime number, then all p elements of Z_p are roots
of the polynomial x^p + [p−1]_p x.

55. Use exercise 54 to show that if p is a prime number, then the polynomial
x^p + [p−1]_p x + [1]_p has no roots in Z_p.

56. Let G be an abelian group with n elements. If we let e denote the identity element of G,
prove that g^n = e, for all g ∈ G. (Hint: Think about the proof of Proposition 7.18.)




57. Let F be a field with n elements and let 1 denote the multiplicative identity of F.
(a) Show that a^n = a, for all a ∈ F. You might want to look back at exercise 56.


(b) Use part (a) to conclude that none of the elements of F are roots of the polynomial
x^n − x + 1.


In exercises 58-66, we will look at another way to find the roots of quadratics in Z_p.
Remember that when you first saw the quadratic formula, it was probably derived by first
completing the square. We can now apply similar techniques to look for the roots of
polynomials with coefficients in Z_p, where p > 2 is a prime number. For example, suppose we
want to find the roots of [2]_7 x^2 + [3]_7 x + [5]_7 in Z_7. We begin with the equation
[2]_7 x^2 + [3]_7 x + [5]_7 = [0]_7 and, in order to make the polynomial monic, multiply both sides of
the equation by [4]_7, the multiplicative inverse of [2]_7, to obtain x^2 + [5]_7 x + [6]_7 = [0]_7. Next, we
add [1]_7, the additive inverse of [6]_7, to both sides to obtain x^2 + [5]_7 x = [1]_7. Recall that at
this point in your earlier algebra classes, you would take half of the coefficient of the x term
and add its square to both sides to turn the left side of the equation into the square of a linear
polynomial. The analogous thing for us to do is to multiply [5]_7 by the multiplicative inverse
of [2]_7 to obtain [6]_7 and then to add the square of [6]_7, which is [1]_7, to both sides, resulting in
x^2 + [5]_7 x + [1]_7 = [2]_7. Observe that the left side of the equation is now the square of
x + [6]_7. Thus, we are now dealing with the equation (x + [6]_7)^2 = [2]_7. Next, note that [2]_7 is
a square in Z_7 as [3]_7^2 = [4]_7^2 = [2]_7. Therefore, our equation now becomes
(x + [6]_7)^2 = [3]_7^2 = [4]_7^2. This reduces to solving in Z_7 the linear equations x + [6]_7 = [3]_7 and
x + [6]_7 = [4]_7, which leads to the solutions x = [4]_7 and x = [5]_7. Keep in mind that in Z_p,
where p > 2 is a prime, not every element is a square. Thus, as you attempt to complete the
square, you will find that some quadratics do not have a solution in Z_p.
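
The completing-the-square procedure just described can be tried out by machine. The following Python sketch is only an illustration under stated assumptions: the function name solve_quadratic_mod_p is made up here, elements of Z_p are represented by the integers 0, ..., p − 1, and the three-argument pow with exponent −1 (available in Python 3.8 and later) is used for multiplicative inverses.

```python
def solve_quadratic_mod_p(a, b, c, p):
    """Roots of a*x^2 + b*x + c in Z_p by completing the square.

    Assumes p is an odd prime and a is not divisible by p.
    """
    def inv(u):
        return pow(u, -1, p)            # multiplicative inverse in Z_p

    # Make the polynomial monic: x^2 + B*x + C = 0.
    B = (b * inv(a)) % p
    C = (c * inv(a)) % p
    # "Half" of the x coefficient: B times the inverse of 2.
    h = (B * inv(2)) % p
    # Completing the square gives (x + h)^2 = h^2 - C.
    rhs = (h * h - C) % p
    # Look through Z_p for elements whose square is the right-hand side.
    square_roots = [s for s in range(p) if (s * s) % p == rhs]
    return sorted({(s - h) % p for s in square_roots})


print(solve_quadratic_mod_p(2, 3, 5, 7))   # the worked example above: [4, 5]
print(solve_quadratic_mod_p(1, 0, 1, 7))   # x^2 + 1 has no roots in Z_7: []
```

An empty list corresponds to the case where the right-hand side is not a square in Z_p.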


58. In Z_7, find the square of each equivalence class.


59. Find the roots in Z_7 of x^2 + x + [1]_7 by completing the square and using exercise 58.


60. Find the roots in Z_7 of [3]_7 x^2 + [4]_7 x + [3]_7 by completing the square and using
exercise 58.


61. Find the roots in Z_7 of x^2 + [5]_7 x + [2]_7 by completing the square and using exercise 58.


62. Find the roots in Z_7 of [5]_7 x^2 + [3]_7 x + [4]_7 by completing the square and using
exercise 58.


63. Explain why the technique of completing the square in Z_p requires that the prime p must
be greater than 2.


64. In Z_11, find the square of each equivalence class.


65. Find the roots in Z_11 of [4]_11 x^2 + [6]_11 x + [1]_11 by completing the square and using
exercise 64.


66. Find the roots in Z_11 of [4]_11 x^2 + [6]_11 x + [8]_11 by completing the square and using
exercise 64.




Group Theory 


In the first seven chapters of this book, we examined Z, Q, R, C, Z_n, and other examples of
commutative rings and fields. Along the way, we also looked at the bijections of various sets, 
such as the roots of polynomials. 


Groups are algebraic objects that occur, often in more than one way, in each of the examples 
just mentioned. We have briefly touched upon groups in Chapters 2, 5, and 7. In light of the 
essential role played by groups, especially finite groups, in Galois’ proof of the insolvability of 
the quintic, it is time for groups to take center stage. 


In the first section of this chapter, you will see some definitions and examples that have 
appeared in earlier chapters. However, repetition can be a good thing. The structure of finite 
groups is more abstract and theoretical than the topics you examined earlier in this book. 
Therefore, seeing some definitions and examples for a second time should help you deal with 
this greater degree of abstraction. When you finish this chapter, not only will you have seen 
the crown jewel of an introductory course in group theory, Sylow’s Theorem, but you will 
also be equipped with all the group theory needed to understand Galois’ work on fifth-degree 
polynomials. 


8.1 Definitions and Examples 


We begin this section with the definition of a group, which you have already seen as 
Definition 5.19. 


Definition 8.1. Let G be a set with a binary operation ∘. We say that G is a group if ∘
satisfies the following properties:


1. Associative Law: For every x, y, z ∈ G, (x∘y)∘z = x∘(y∘z).


2. Identity: There is an element in G, usually denoted as e, such that x∘e = x = e∘x, for
every x ∈ G.


3. Inverses: For every x ∈ G there is an element in G, usually denoted as x^{-1}, such that

x∘x^{-1} = e = x^{-1}∘x.
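
For a finite set, the three properties in the definition can be checked by brute force. The Python sketch below is an illustration only; the helper name is_group is an assumption of this example, and elements of Z_6 are represented by the integers 0, ..., 5.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force check of the group axioms on a finite set."""
    elements = list(elements)
    # op must be a binary operation on the set, so results must stay in the set.
    if any(op(x, y) not in elements for x, y in product(elements, repeat=2)):
        return False
    # 1. Associative law.
    if any(op(op(x, y), z) != op(x, op(y, z))
           for x, y, z in product(elements, repeat=3)):
        return False
    # 2. Identity: an e with x o e = x = e o x for every x.
    identities = [e for e in elements
                  if all(op(x, e) == x == op(e, x) for x in elements)]
    if not identities:
        return False
    e = identities[0]
    # 3. Inverses: every x needs some y with x o y = e = y o x.
    return all(any(op(x, y) == e == op(y, x) for y in elements)
               for x in elements)


print(is_group(range(6), lambda x, y: (x + y) % 6))   # Z_6 under addition: True
print(is_group(range(6), lambda x, y: (x * y) % 6))   # Z_6 under multiplication: False
print(is_group([1, 5], lambda x, y: (x * y) % 6))     # {1, 5} under multiplication mod 6: True
```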




If G is a group, we let |G| be the number of elements in the set G and call it the order of the 
group G. More generally, for subsets M of G, we let |M| denote the number of elements in M. 
In light of the experience we gained in earlier chapters, we will begin looking at examples of 
groups by focusing on three large classes: 


(I) Commutative Rings and Fields under Addition
(II) Invertible Elements in Commutative Rings under Multiplication
(III) Bijections of Sets


I. Commutative Rings and Fields under Addition


Recall that if R is a commutative ring, then addition is a binary operation that satisfies the 
three conditions required to be a group. Multiplication is also a binary operation on a 
commutative ring, but as we will see a little later, commutative rings do not form a group 
under multiplication. However, since every commutative ring is a group under addition, all ten 
sets on the following list are groups under addition: 


(a) Z, the set of integers 

(b) Q, the set of rational numbers 

(c) R, the set of real numbers 

(d) C, the set of complex numbers 

(e) Z_n, the set of integers modulo n

(f) Q[x], the set of polynomials with rational coefficients

(g) R[x, y], the set of polynomials in two variables with real coefficients
(h) Q(i) = {a + bi | a, b ∈ Q}

(i) Z[i] = {a + bi | a, b ∈ Z}

(j) Q(i, √2) = {a + bi + c√2 + di√2 | a, b, c, d ∈ Q}


At this point, there is another important group under addition worth noting: If n ∈ Z, let
nZ = {na | a ∈ Z}. Although nZ is not a ring when n ≠ ±1, as it then does not contain the identity
element under multiplication, it will turn out to be a very useful group in motivating various concepts.


II. Invertible Elements in Commutative Rings under Multiplication


If R is a commutative ring, then multiplication is also a binary operation. However, R is not a 
group under multiplication, as elements of R can fail to have a multiplicative inverse in R. 




Indeed, since 0·r = 0, for all r ∈ R, it is impossible to multiply 0 by an element of R to obtain
1 as the answer. Thus, the additive identity of R does not have a multiplicative inverse, so R is
not a group under multiplication.


Therefore, in our search for groups, it makes sense to consider the nonzero elements of
commutative rings. Indeed, if R is a field, then the nonzero elements of R are a group under
multiplication. However, for other commutative rings, the situation can be somewhat different. If we
consider Z, then 1 and −1 are the only integers that have a multiplicative inverse in Z.
Remember, it is not enough that the nonzero elements of Z have multiplicative inverses in Q.
For a set to be a group, the inverse of every element needs to belong to the set and not to a
larger set. Yet another problem arises when we consider the ring Z_6. Note that [2]_6, [3]_6, [4]_6
are zero divisors in Z_6, so they are nonzero elements that do not have inverses in Z_6. In light
of the examples of Z and Z_6, it now makes sense to consider the elements in a commutative
ring that have multiplicative inverses in the ring. In fact, in Proposition 7.7, we showed
that U(R), the set of elements in R with a multiplicative inverse in R, is a group under
multiplication. Therefore, the following sets are all groups under multiplication:


(a) {1, −1}, the set of invertible elements in Z

(b) Q*, the set of nonzero rational numbers

(c) R*, the set of nonzero real numbers

(d) C*, the set of nonzero complex numbers

(e) {[1]_5, [2]_5, [3]_5, [4]_5}, the set of nonzero elements of the field Z_5
(f) {[1]_6, [5]_6}, the set of invertible elements in Z_6

(g) {[1]_10, [3]_10, [7]_10, [9]_10}, the set of invertible elements in Z_10
(h) Q*, which is also the set of invertible elements in Q[x]

(i) {1, −1, i, −i}, the set of invertible elements in Z[i]


(j) {1, −1}, which is also the set of invertible elements in Z[x]
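
The entries on this list that come from Z_5, Z_6, and Z_10 can be recomputed directly from the definition of an invertible element. The short Python sketch below is an illustration; the function name units is an assumption of this example, and classes in Z_n are represented by 0, ..., n − 1.

```python
def units(n):
    """Representatives a of the classes in Z_n having some b with a*b = 1 in Z_n."""
    return [a for a in range(n) if any((a * b) % n == 1 for b in range(n))]


print(units(5))    # [1, 2, 3, 4]
print(units(6))    # [1, 5]
print(units(10))   # [1, 3, 7, 9]
```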


III. Bijections of Sets


In Section 2.4, we showed that if S is a set and ∘ represents composition of functions, then ∘ is
a binary operation on each of the following sets:


(a) F(S), the set of functions from S to S


(b) Inj(S), the set of injective functions from S to S




(c) Sur(S), the set of surjective functions from S to S
(d) Bij(S), the set of bijective functions from S to S


If we look back at Theorem 2.16, we can see that it verifies that Bij(S) is a group under ∘.
We now take a look at the sets F(S), Inj(S), and Sur(S). Theorem 2.11 asserted that the
composition of functions is associative. Furthermore, F(S), Inj(S), and Sur(S) all contain the
identity map e. Therefore, the only thing that could prevent F(S), Inj(S), and Sur(S) from being
groups under ∘ is if at least one element failed to have an inverse.


If we consider the very special case where S has only one element, then F(S), Inj(S), Sur(S),
and Bij(S) are all the same and consist only of the identity map e. Thus, in this case and
only in this case, F(S), Inj(S), Sur(S), and Bij(S) are all groups. Now let us consider the case
where S contains more than one element. If t ∈ S, consider the function f defined as f(s) = t,
for all s ∈ S. Since S has more than one element, the function f takes on the value t more
than once. Thus, f does not have an inverse f^{-1}, as f^{-1}(t) would not be defined. Thus, as
long as S has more than one element, F(S) is not a group.


We now turn our attention to Inj(S) and Sur(S) and consider the important case where S is
finite. If S has n elements and f ∈ Inj(S), then the range of f consists of n different elements
of S. Thus, the range of f is all of S, so f is also surjective.


On the other hand, if S has n elements and g ∈ Sur(S), then g cannot repeat any values. To see
this, observe that if g gave us the same value for two different elements of S, then the range
of g would have at most n − 1 elements of S. Hence, g ∈ Inj(S), and we have shown, when S
is finite, that Inj(S) = Sur(S) = Bij(S). Thus, in this case, Inj(S), Sur(S), and Bij(S) are all
groups.
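
For a small finite set, the equality Inj(S) = Sur(S) = Bij(S) can be confirmed by listing every function from S to S. The following Python sketch is an illustration only, with S taken to be the three-element set {0, 1, 2} and each function stored as its tuple of values.

```python
from itertools import product

S = (0, 1, 2)
# A function f from S to S is stored as the tuple (f(0), f(1), f(2)).
functions = list(product(S, repeat=len(S)))
injective = [f for f in functions if len(set(f)) == len(S)]    # no repeated values
surjective = [f for f in functions if set(f) == set(S)]        # every element is hit

print(len(functions))            # 27 functions in F(S)
print(len(injective))            # 6 of them are injective
print(injective == surjective)   # True: the same 6 functions are surjective
```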

When S is infinite, there are elements of Inj(S) that are not surjective and elements of Sur(S)
that are not injective. However, for an element of F(S) to have an inverse, it must be both
injective and surjective. Thus, when S is infinite, F(S), Inj(S), and Sur(S) are not groups. To
better illustrate this, consider the following.


■ Example
Let S = N, the set of natural numbers, and consider the function f: N → N defined as
f(x) = 2x,


for all x ∈ N. It is easy to see that f ∈ Inj(N) but f ∉ Sur(N). Observe that 1 is not in the
range of f, so f^{-1} does not exist as f^{-1}(1) is not defined. Since f does not have an
inverse, Inj(N) is not a group.




However, before leaving this example, it is worth considering the function g: N → N
defined as


g(x) = x, when x is odd and g(x) = x/2, when x is even.


We can see that g ∈ Sur(N), but g ∉ Inj(N), as g(1) = 1 = g(2). Since g repeats values,
g^{-1} does not exist, as g^{-1}(1) is not defined. Thus, Sur(N) is not a group. On the other
hand, it is interesting to observe that if x ∈ N, then


(g∘f)(x) = g(f(x)) = g(2x) = x = e(x),


where e is the identity.
As a result, g∘f = e, which makes it look somewhat like f and g invert each other.
However,


(f∘g)(1) = f(g(1)) = f(1) = 2 ≠ 1 = e(1).


Thus, f∘g ≠ e. This indicates that when dealing with binary operations, to show that an
element has an inverse, you do indeed need to check that things work on both sides.
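
The one-sided behavior of f and g is easy to see numerically. The Python sketch below is an illustration that evaluates both compositions on the first ten natural numbers; only a finite sample is checked, so it supports the argument above but does not replace it.

```python
def f(x):
    return 2 * x


def g(x):
    # g(x) = x when x is odd and g(x) = x/2 when x is even
    return x if x % 2 == 1 else x // 2


sample = range(1, 11)
print([g(f(x)) for x in sample])   # [1, 2, ..., 10]: g o f acts as the identity
print([f(g(x)) for x in sample])   # starts 2, 2, 6, 4, ...: f o g is not the identity
```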


We will now look more closely at the group Bij(S) for some small sets S. When S is finite, we
will stop using the notation Bij(S) and will adopt some terminology that is more common in 
abstract algebra. 


Definition 8.2. If n ∈ N and A = {1, 2, ..., n}, let S_n denote the group of bijections from A to
itself. We refer to S_n as the symmetric group or call it the symmetric group of degree n.


■ Examples


1. If n = 1 and A = {1}, then S_1 consists solely of the identity map e, where e(1) = 1 and
e∘e = e.


2. If n = 2 and A = {1, 2}, then S_2 consists of two elements. We will call these elements
e, f, and they are defined as


e(1) = 1, e(2) = 2,
f(1) = 2, f(2) = 1.


Observe that e∘e = e, e∘f = f, f∘e = f, and f∘f = e. Note that each element of
S_2 is its own inverse.




3. If n = 3 and A = {1, 2, 3}, then S_3 has six elements, which we will call e, f, g, h, j, k,
and they are defined as


e(1) = 1, e(2) = 2, e(3) = 3,
f(1) = 1, f(2) = 3, f(3) = 2,
g(1) = 3, g(2) = 2, g(3) = 1,
h(1) = 2, h(2) = 1, h(3) = 3,
j(1) = 2, j(2) = 3, j(3) = 1,
k(1) = 3, k(2) = 1, k(3) = 2.


We can represent S_3 and ∘ in the following table:


∘ | e  f  g  h  j  k
--+-----------------
e | e  f  g  h  j  k
f | f  e  j  k  g  h
g | g  k  e  j  h  f
h | h  j  k  e  f  g
j | j  h  f  g  k  e
k | k  g  h  f  e  j


For example, to find g∘h, we look at the term on the table in the same row as the g
under ∘ and the same column as the h to the right of ∘ to obtain g∘h = j. Similarly,
to find h∘g, we look for the term on the table in the same row as the h under ∘ and
the same column as the g to the right of ∘ to obtain h∘g = k.
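
The table for S_3 can be regenerated from the definitions of e, f, g, h, j, k. In the Python sketch below (an illustration; the dictionary S3 and the helper compose are names chosen for this example), each bijection is stored as the tuple of its values at 1, 2, 3, and the entry in row a and column b is a∘b.

```python
# Each bijection s of {1, 2, 3} is stored as the tuple (s(1), s(2), s(3)).
S3 = {
    "e": (1, 2, 3), "f": (1, 3, 2), "g": (3, 2, 1),
    "h": (2, 1, 3), "j": (2, 3, 1), "k": (3, 1, 2),
}
name_of = {images: name for name, images in S3.items()}


def compose(a, b):
    """Name of the composition a o b, the map sending i to a(b(i))."""
    return name_of[tuple(S3[a][S3[b][i] - 1] for i in range(3))]


names = "efghjk"
print("o " + " ".join(names))
for a in names:
    print(a + " " + " ".join(compose(a, b) for b in names))

print(compose("g", "h"), compose("h", "g"))   # j k
```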


In all the examples of groups that come from commutative rings and fields, the order of the
terms does not matter when applying the binary operation. Therefore, the fact that g∘h ≠ h∘g
in S_3 is worth taking note of. This leads to


Definition 8.3. If G is a group such that x∘y = y∘x, for all x, y ∈ G, we say that G is
abelian. When a group is not abelian, we call it nonabelian.


Therefore, S_1 and S_2 are abelian and S_3 is nonabelian. In fact, as we will now see, S_n is
nonabelian whenever n ≥ 3. If this looks familiar, this is merely a reappearance of
Proposition 5.20 using somewhat different notation.


Proposition 8.4. S_n is nonabelian, for all n ≥ 3.




Proof: Let x, y ∈ S_n be defined as
x(1) = 2, x(2) = 1, x(3) = 3, x(m) = m, for any m > 3,
y(1) = 1, y(2) = 3, y(3) = 2, y(m) = m, for any m > 3.
Therefore,
(x∘y)(1) = x(y(1)) = x(1) = 2,
whereas
(y∘x)(1) = y(x(1)) = y(2) = 3.


Hence, x∘y ≠ y∘x and so, S_n is nonabelian.


As you probably noticed, the definition of a group states that the identity element is usually
written as e, and the inverse of the element x is usually written as x^{-1}. You have also noticed
that when the context clearly dictates otherwise, we are willing to abandon this terminology.
For example, when looking at the group Q under addition, we denote the identity as 0 and the
inverse of a as −a. In fact, it would be silly and needlessly confusing to do otherwise.


However, as a convenience, we will typically stick with multiplicative notation when dealing
with groups. In fact, we will often leave out the symbol ∘ and write ab in place of a∘b.
Similarly, we will use a^2 as a shorthand for a∘a and a^3 as a shorthand for a∘a∘a. More
generally, if n ∈ N, then a^n will be the shorthand for repeatedly applying ∘ to n copies of a,
and a^{-n} will be the shorthand for applying ∘ to n copies of a^{-1}. You should convince
yourself that a^{-n} is indeed the inverse of a^n. By adopting this shorthand, together with the
convention a^0 = e, the familiar rules of exponents, such as a^m a^n = a^{m+n} and (a^m)^n = a^{mn},
for all m, n ∈ Z, hold when doing
computations in a group. In particular, you should convince yourself that if x belongs to a
group G, then (x^{-1})^{-1} = x. This means that not only is x^{-1} the inverse of x, but x is also the
inverse of x^{-1}.


Given a commutative ring R, Proposition 5.15 provided us with an easy way to find new 
commutative rings contained in R. We would like to do a similar thing for groups. In particular, 
given a group G, we would like an easy way to produce new groups that are contained in G. 


Definition 8.5. Let G be a group with binary operation ∘. If H is a subset of G that is also a
group under ∘, then we call H a subgroup of G.


If H is a subset of G, then in order to show that H is a subgroup of G, we need to show that
(i) H is closed under ∘
(ii) H contains an identity element


(iii) every element of H has an inverse in H




Observe that since G is a group, we already know that ∘ is associative, so we only need to
check that H satisfies (i), (ii), and (iii). However, it would be nice if there was an easier way to
show that H is a subgroup of G and that is the point of


Proposition 8.6. Suppose G is a group under ∘ and let H be a nonempty subset of G.
(a) If x∘y^{-1} ∈ H, for all x, y ∈ H, then H is a subgroup of G.

(b) If H is finite and x∘y ∈ H, for all x, y ∈ H, then H is a subgroup of G.

(c) If H = K_1 ∩ K_2, where K_1 and K_2 are subgroups of G, then H is a subgroup of G.


Proof. For part (a), we will first show that H contains the identity element of G. Since H is
nonempty, let x ∈ H. Therefore,


x∘x^{-1} ∈ H.


But since x∘x^{-1} = e, H does contain the identity element of G.


Next, we need to show that H contains the inverse of each of its elements. However, if x ∈ H,
then the fact that both e and x belong to H tells us that


e∘x^{-1} ∈ H.


Since e∘x^{-1} = x^{-1}, H does contain the inverse of each of its elements.


For the final part of (a), we need to show that H is closed under ∘. If x, y ∈ H, let z = y^{-1}.
We know that z^{-1} = y, and it follows from the previous argument that z ∈ H. Therefore,


x∘z^{-1} ∈ H.


But since x∘z^{-1} = x∘y, we see that H is also closed under ∘.


For part (b), suppose that H had the property that it contained the inverse of each of its
elements. Then, if x, y ∈ H, it would follow that y^{-1} ∈ H, and we could conclude that
x∘y^{-1} ∈ H. This tells us that H satisfies the condition in part (a) of this result, so H would
indeed be a subgroup of G. Thus, it suffices to show that whenever y ∈ H, we have y^{-1} ∈ H.


Consider the list


y, y^2, y^3, ....


Observe that every element of this list belongs to the finite set H. Since the list is infinite,
some element on this list must occur more than once. In fact, some element on this list must
occur an infinite number of times. Therefore, there is some m ≥ 1 such that y^m occurs an
infinite number of times on this list. However, this tells us that there is some n such that




n > m + 1 and y^m = y^n. If we let t = n − m − 1, then t is a positive integer, and we can rewrite
y^m = y^n as
y^m = y^m · y · y^t.


Observe that since t > 0, y^t is on the preceding list, so y^t ∈ H. We will now exploit the
properties of the group G to show that y^t is actually the inverse of y. The following argument
might strike you as longer than necessary. However, at this stage, it is important to include
every step and to see that the only properties we are allowed to use are those dealing with
associativity, identities, and inverses.


Since y^m ∈ H ⊆ G, y^m has an inverse in G, which we can denote as z. Using multiplicative
notation and multiplicative terminology, we can multiply both sides of the previous equation
on the left by z to obtain
z · y^m = z · (y^m · y · y^t).
Applying the associative law, we obtain
z · y^m = (z · y^m) · (y · y^t).
Next, using properties of inverses and identities, the previous equation becomes
e = e · (y · y^t)
and then
e = y · y^t.


Since y^{-1} ∈ G, we can multiply this equation on the left by y^{-1} to obtain


y^{-1} · e = y^{-1} · (y · y^t).


Associativity and the properties of identities and inverses now tell us first that


y^{-1} = (y^{-1} · y) · y^t,


then


y^{-1} = e · y^t,


and finally


y^{-1} = y^t.


Therefore, y^t is the inverse of y, so H does contain y^{-1}, as desired.


For part (c), suppose x, y ∈ H = K_1 ∩ K_2; in light of part (a), it suffices to show that
xy^{-1} ∈ K_1 ∩ K_2. Since K_1 and K_2 are both subgroups and x, y both belong to K_1 and K_2, it is




immediate that y^{-1} belongs to both K_1 and K_2. Hence, xy^{-1} belongs to both K_1 and K_2, so
xy^{-1} ∈ H, as desired.
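
The key step in part (b), that the inverse of y shows up among the positive powers of y, can be watched in a small case. The Python sketch below is an illustration under stated assumptions: the group is taken to be the nonzero elements of Z_7 under multiplication, represented by 1, ..., 6, and the helper names is_closed and inverse_as_power are made up for this example.

```python
def is_closed(H, op):
    """True when x op y stays in H for all x, y in H."""
    return all(op(x, y) in H for x in H for y in H)


def inverse_as_power(y, op, identity):
    """Return (t, y^t) with y op y^t == identity, scanning y, y^2, y^3, ..."""
    t, power = 1, y
    while op(y, power) != identity:
        t, power = t + 1, op(power, y)
    return t, power


def mul7(a, b):
    return (a * b) % 7


H = {1, 2, 4}                          # a finite subset of the nonzero elements of Z_7
print(is_closed(H, mul7))              # True, so part (b) applies to H
print(inverse_as_power(2, mul7, 1))    # (2, 4): 2^2 = 4 and 2 * 4 = 8 = 1 in Z_7
print(is_closed({1, 2}, mul7))         # False, since 2 * 2 = 4 is missing
```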


Proposition 8.6 opens the door to countless examples of groups that arise as subgroups 
of groups we have already examined in this section. In particular, we are immediately led to 


Definition 8.7. If G is a group and g ∈ G, let <g> = {g^n | n ∈ Z}. We call <g> the cyclic
subgroup generated by g. If there exists some a ∈ G such that G = <a>, then we say that G is
a cyclic group.


It is very important to check that the set <g> satisfies the conditions in Proposition 8.6(a)
and is therefore a subgroup. Indeed, if g ∈ G and x, y ∈ <g>, then there exist n, m ∈ Z
such that x = g^n and y = g^m. We are using multiplicative notation and now have that
y^{-1} = g^{-m} and

xy^{-1} = g^n g^{-m} = g^{n−m}.
Since n − m ∈ Z, it follows that xy^{-1} ∈ <g>, so <g> is a subgroup of G.


On the other hand, if G is a group and we are using additive notation, then <g> refers to the
set {ng | n ∈ Z}. This should be much clearer after we look at several examples.


■ Examples


1. Z_10 is a group with 10 elements in which we use additive notation. As we compute
the cyclic subgroup generated by each of the 10 elements, note that different
elements can generate the same cyclic subgroup. You should check that


<[0]_10> = {[0]_10},
<[1]_10> = <[3]_10> = <[7]_10> = <[9]_10> = Z_10,
<[2]_10> = <[4]_10> = <[6]_10> = <[8]_10> = {[0]_10, [2]_10, [4]_10, [6]_10, [8]_10},
<[5]_10> = {[0]_10, [5]_10}.


Since there exist elements a ∈ Z_10 such that <a> = Z_10, we see that Z_10 is cyclic.


2. Generalizing the previous example, we look at the groups Z_n, for n > 2. Since
<[1]_n> = Z_n, all groups of the form Z_n are cyclic.


3. Z is also a group under addition and is a cyclic group as


<1> = <−1> = Z.




For any m ∈ Z, <m> consists of all multiples of m. For example, <2> is the
set of all even integers. Also note that if a ∈ Z and <a> = Z, then a must be
either 1 or −1.


4. Q is also a group under addition, but it is not cyclic. If r ∈ Q, we can write r = a/b,
where a ∈ Z and b ∈ N. Note that 1/(2b) ∉ <r>, as there does not exist any n ∈ Z such
that 1/(2b) = n(a/b). Since there are elements of Q not in <r>, it follows that Q is not
cyclic. For example, <1/3> = {n/3 | n ∈ Z} and the subgroup <1/3> does not contain 1/6.


5. C* is a group under multiplication. Here are examples of some of its cyclic
subgroups:


<1> = {1},
<−1> = {1, −1},
<i> = <−i> = {1, −1, i, −i},
<2> = {1, 2, 1/2, 4, 1/4, 8, 1/8, 16, ...},
<−1/2> = {1, −2, −1/2, 4, 1/4, −8, −1/8, 16, ...}.

Also, if n ∈ N, then <cis(2π/n)> consists of the n nth roots of 1 in the complex
numbers.


6. The group C* is not cyclic. There are various ways to see this. One way is to note
that if α ∈ C* had the property that <α> contained −1, then some power of α
would be equal to −1. It then follows from DeMoivre's Theorem that the length
of α must be 1. But now DeMoivre's Theorem asserts that every element of <α>
also has length 1. As a result, we can see that if −1 ∈ <α>, then 2 ∉ <α>.
Therefore, no cyclic subgroup of C* can equal all of C*. Hence, C* is not cyclic.


7. Z_5* is a group under multiplication. Since it only contains four elements, it is easy to
compute all its cyclic subgroups.


<[1]_5> = {[1]_5},
<[2]_5> = <[3]_5> = Z_5*,
<[4]_5> = {[1]_5, [4]_5}.
Observe that Z_5* is cyclic and is generated by both [2]_5 and [3]_5.


8. Earlier in this section, we looked at S_3 and saw how each of the bijections
e, f, g, h, j, k behaved on the set {1, 2, 3}. If you look back at the table for S_3 under



composition of functions, it is not hard to compute its cyclic subgroups.
<e> = {e},
<f> = {e, f},
<g> = {e, g},
<h> = {e, h},


<j> = <k> = {e, j, k}.
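
The cyclic subgroups of Z_10 (under addition) and of Z_5* (under multiplication) listed in these examples can be recomputed mechanically. The Python sketch below is an illustration; the helper name cyclic_subgroup is an assumption of this example, classes are represented by integers, and the loop relies on the group being finite, so that the positive multiples or powers of an element eventually return to the identity.

```python
def cyclic_subgroup(g, op, identity):
    """Elements of <g> in a finite group, collected by repeatedly applying op."""
    elements, x = [identity], g
    while x != identity:
        elements.append(x)
        x = op(x, g)
    return sorted(elements)


def add10(a, b):
    return (a + b) % 10


def mul5(a, b):
    return (a * b) % 5


for a in range(10):
    print(a, cyclic_subgroup(a, add10, 0))    # e.g. 2 gives [0, 2, 4, 6, 8]

for a in range(1, 5):
    print(a, cyclic_subgroup(a, mul5, 1))     # e.g. 4 gives [1, 4]
```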


When computing cyclic subgroups <g> of a group G, one often comes across a positive
integer m such that g^m = e, where e is the identity element. When this happens, the Well
Ordering Principle tells us that there is a smallest positive integer n such that g^n = e. In this
situation, let us consider the elements

e, g, g^2, ..., g^{n−1}.
If m ∈ Z, then the division algorithm asserts there exist q, r ∈ Z, with 0 ≤ r ≤ n − 1, such that
m = qn + r. Since the ordinary rules of exponents hold, we now have


g^m = g^{qn+r} = g^{qn} g^r = (g^n)^q g^r = e^q g^r = e g^r = g^r.


Therefore, every element of <g> appears on this list. Next, we claim that no element of
<g> occurs more than once on this list. Observe that the only way the list could contain a
repetition is if there exist integers s, t such that 0 ≤ s < t < n and g^t = g^s. But this implies that
g^{t−s} = e. Since t − s is a positive integer less than n, this contradicts the minimality of n. Thus, all n
elements of <g> on the list are different. Formalizing these observations, we have


Definition 8.8. Suppose G is a group and g ∈ G. If there exists a positive integer m such that
g^m = e, let n be the smallest one with this property. In this case, we say that g has order n and
denote this as o(g) = n. Furthermore, <g> = {e, g, g^2, ..., g^{n−1}}.
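
The reduction g^m = g^r, where r is the remainder of m on division by the order n, can be checked in a concrete group. The Python sketch below is an illustration that uses the nonzero elements of Z_7 under multiplication; the helper names order and power are assumptions of this example.

```python
def order(g, op, identity):
    """Smallest positive n with g^n == identity (assumes such an n exists)."""
    n, x = 1, g
    while x != identity:
        n, x = n + 1, op(x, g)
    return n


def power(g, m, op, identity):
    """g^m for an integer m >= 0, by repeated application of op."""
    x = identity
    for _ in range(m):
        x = op(x, g)
    return x


def mul7(a, b):
    return (a * b) % 7


n = order(2, mul7, 1)
print(n)                               # 3, since the powers of 2 in Z_7 are 2, 4, 1
print(power(2, 100, mul7, 1))          # 2
print(power(2, 100 % n, mul7, 1))      # 2 again, because 100 = 33*3 + 1
```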


As is often the case, symmetric groups provide excellent examples of new concepts. 


Example 


Let {e, f, g, h, j, k} be the six elements of S_3 that we described earlier in the section.
Having computed all the cyclic subgroups of S_3 earlier in the section, we now
have o(e) = 1, o(f) = 2, o(g) = 2, o(h) = 2, o(j) = 3, and o(k) = 3.




Having calculated all the cyclic subgroups of S_3, we have seen that S_3 is not cyclic. However,
since S_3 is nonabelian, the next result makes it quite clear, without calculating cyclic
subgroups, that S_3 is not cyclic.


Proposition 8.9. All cyclic groups are abelian. 


Proof: Suppose G is a group and G = <g>, for some g ∈ G. We need to show that if
x, y ∈ G, then xy = yx. We know that there exist integers n, m such that x = g^n and y = g^m.
Since the ordinary rules of exponents apply, we have


xy = g^n g^m = g^{n+m} = g^{m+n} = g^m g^n = yx,


as desired.


Although all cyclic groups are abelian, there are certainly abelian groups that are not cyclic. 


■ Example


Let G = U(Z_8); then G = {[1]_8, [3]_8, [5]_8, [7]_8}. Clearly, G is abelian. However, you
should check that


o([1]_8) = 1, o([3]_8) = 2, o([5]_8) = 2, and o([7]_8) = 2.


Therefore, none of the cyclic subgroups of G contain all four elements of G. Hence,
U(Z_8) is abelian but not cyclic.
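
The orders claimed in this example are quick to confirm. The Python sketch below is an illustration (the helper name order_mod is made up here); it computes the order of each element of U(Z_8) and, for contrast, of U(Z_5), and reports whether any element has order equal to the size of the group, which is exactly what a generator would need.

```python
def order_mod(a, n):
    """Order of the class of a in U(Z_n); assumes a is invertible modulo n."""
    k, x = 1, a % n
    while x != 1:
        k, x = k + 1, (x * a) % n
    return k


U8 = [1, 3, 5, 7]
orders8 = {a: order_mod(a, 8) for a in U8}
print(orders8)                                        # {1: 1, 3: 2, 5: 2, 7: 2}
print(any(k == len(U8) for k in orders8.values()))    # False: U(Z_8) has no generator

U5 = [1, 2, 3, 4]
orders5 = {a: order_mod(a, 5) for a in U5}
print(any(k == len(U5) for k in orders5.values()))    # True: 2 and 3 generate U(Z_5)
```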


We will now look at various other subgroups of groups of the form Bij(S) and S_n.


■ Examples


In the next series of examples, we will be looking at some subgroups of Bij(R), for
various commutative rings R.


1. Let R be a commutative ring and if m ∈ U(R), b ∈ R, let


T_{m,b}: R → R


be defined as T_{m,b}(r) = mr + b, for all r ∈ R. It is not hard to see that the linear
function T_{m,b} is a bijection. Next, let


G = {T_{m,b} | m ∈ U(R), b ∈ R}.




We claim that G satisfies the conditions of Proposition 8.6(a) and is therefore a
group. If T_{m,b} ∈ G, then 1/m ∈ U(R) and −b/m ∈ R, so we have


(T_{m,b} ∘ T_{1/m,−b/m})(r) = T_{m,b}(r/m − b/m) = m(r/m − b/m) + b = r − b + b = r


and


(T_{1/m,−b/m} ∘ T_{m,b})(r) = T_{1/m,−b/m}(mr + b) = (1/m)(mr + b) − b/m = r + b/m − b/m = r,


for all r ∈ R. Since T_{1/m,−b/m} belongs to G and it is the inverse of T_{m,b} in Bij(R), G
contains the inverse of each of its elements.


Now suppose x, y ∈ G; then there exist n, m, a, b ∈ R, with n, m ∈ U(R), such that
x = T_{n,a} and y = T_{m,b}. Since y^{-1} = T_{1/m,−b/m}, we now have


(xy^{-1})(r) = (T_{n,a} ∘ T_{1/m,−b/m})(r) = T_{n,a}(r/m − b/m) = n(r/m − b/m) + a = (n/m)r + (a − nb/m),


for all r ∈ R. As a result, xy^{-1} = T_{n/m, a−nb/m}. Observe that n/m = nm^{-1} ∈ U(R) and
a − nb/m = a − nbm^{-1} ∈ R, thus xy^{-1} ∈ G. Thus, Proposition 8.6(a) tells us that G is
indeed a group.


In this example, if R is a ring where 1 is not its own additive inverse (in other words,
1 ≠ −1), then the groups obtained in this way are not abelian. A direct computation
shows that


(T_{−1,0} ∘ T_{1,1})(0) = T_{−1,0}(T_{1,1}(0)) = T_{−1,0}(1) = −1,
whereas
(T_{1,1} ∘ T_{−1,0})(0) = T_{1,1}(T_{−1,0}(0)) = T_{1,1}(0) = 1.
Since 1 ≠ −1, we see that T_{−1,0} ∘ T_{1,1} ≠ T_{1,1} ∘ T_{−1,0}. Thus, G is not abelian.


2. We continue to examine the preceding example. If T_{m,b}, T_{n,a} ∈ G give the same value
when you plug in 0, it follows that b = a. If these functions also give the same value
when you plug in 1, then m = n. As a result, each different ordered pair
(m, b) ∈ U(R) × R provides us with a different function.

Next, if n > 2, we will consider the special case where R = Z_n; thus,
G = {T_{m,b} | m ∈ U(Z_n), b ∈ Z_n}. Since U(Z_n) has φ(n) elements and Z_n has n
elements, G is a nonabelian group with φ(n)n elements. For example, if n = 3, then
G is nonabelian with 6 elements, whereas if n = 4, then G is nonabelian with 8
elements.


3. In the previous paragraph, we examined G = {T_{m,b} | m ∈ U(Z_n), b ∈ Z_n}. Every
element of G is a bijection of Z_n, so we can think of G as a subgroup of S_n. As we
will see later in this chapter, every finite group can be viewed as a subgroup of S_t, for
some appropriate value of t ∈ N. That is one of the reasons that symmetric groups
are such an important part of finite group theory.

Let us now look at the following subset of G,

H_n = {T_{m,b} | m = [1]_n or [-1]_n, b ∈ Z_n}.

The set H_n is finite; in fact, it has exactly 2n elements. In order to show that H_n is a
group, by Proposition 8.6(b), it suffices to show that H_n is closed under composition
of functions.

Based on the experience you obtained working with elements of the form
T_{m,b} ∈ Bij(R), you should have little trouble verifying that

T_{n,a} ∘ T_{m,b} = T_{nm, nb+a}.

Therefore, if m, n ∈ {[1]_n, [-1]_n} and a, b ∈ Z_n, it is easy to see that nm ∈ {[1]_n,
[-1]_n} and nb + a ∈ Z_n. Thus, T_{nm, nb+a} ∈ H_n and H_n is indeed a group. In fact, based
on the work we have done with groups of this type, we can see that H_n is a nonabelian
group with 2n elements.
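The claims in these three examples can be spot-checked for small n. The following is a minimal sketch under our own conventions, not the author's: the map T_{m,b} on Z_n is stored as the pair (m, b) of integers, and the names `affine_group`, `compose`, `is_closed`, and `is_abelian` are hypothetical helpers.

```python
from math import gcd
from itertools import product

def affine_group(mod, slopes=None):
    """All pairs (m, b) standing for T_{m,b}(r) = m*r + b on Z_mod.

    By default m ranges over the units of Z_mod; passing `slopes`
    restricts m to a smaller set, e.g. [1, mod - 1] for the subset H_mod.
    """
    if slopes is None:
        slopes = [m for m in range(mod) if gcd(m, mod) == 1]
    return [(m % mod, b) for m, b in product(slopes, range(mod))]

def compose(s, t, mod):
    """Composition rule T_{p,a} o T_{m,b} = T_{pm, pb+a}, with s = (p, a), t = (m, b)."""
    (p, a), (m, b) = s, t
    return ((p * m) % mod, (p * b + a) % mod)

def is_closed(elements, mod):
    pool = set(elements)
    return all(compose(s, t, mod) in pool for s in pool for t in pool)

def is_abelian(elements, mod):
    return all(compose(s, t, mod) == compose(t, s, mod)
               for s in elements for t in elements)

for mod in (3, 4, 5):
    G = affine_group(mod)                       # phi(mod) * mod elements
    H = affine_group(mod, slopes=[1, mod - 1])  # 2 * mod elements
    print(mod, len(G), is_abelian(G, mod), len(H), is_closed(H, mod), is_abelian(H, mod))
```

For n = 3 and n = 4 the sets G and H_n coincide, since the only units are [1]_n and [-1]_n; for n = 5 the subset H_5 has 10 of the 20 elements of G.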


The preceding construction shows us a way, for every n ≥ 3, to construct a nonabelian group
with 2n elements that can be viewed as a subgroup of S_n. We will now examine how these
nonabelian subgroups of S_n also arise in a geometric context.


Suppose a jigsaw is used to cut an isosceles triangle, which is not equilateral, out of a piece of 
wood. We would like to analyze all the different ways in which that triangle can be removed 
from the wood, moved around in three dimensions, and then placed back into the piece of 
wood. Note that although the triangle must end up in its original location in the piece of 
wood, it need not be placed in its original position. To help keep track of any changes we make 
in the position of the triangle, we will label each vertex and corner of the triangle. The label
for each vertex will be outside the triangle, whereas the label for each corner will be inside the
triangle. Observe that the side of the triangle connecting vertex #2 to vertex #3 is shorter than 
the other two sides. 


When you place the triangle back into the wood, it is certainly possible that you placed it
back in its original position. In that case, each corner of the triangle ends up at the vertex with
the same number, so we can represent this motion by the element e ∈ S_3, where e(1) = 1,
e(2) = 2, e(3) = 3.


However, there is another way you could position the triangle. Obviously, corner #1 must end 
up at vertex #1; otherwise, the triangle will not fit back into its original location. Now, think 
about the line beginning at vertex #1 and heading down the middle of the triangle. We can flip 
the triangle around this line, thereby switching the part of the triangle that faces up with the 
part that faces down. The result of this is 




If we check where each corner of the triangle moves, we see that corner #1 remains at vertex
#1, whereas corner #2 ends up at vertex #3 and corner #3 ends up at vertex #2. Therefore, we
can represent this motion by the element f ∈ S_3, where f(1) = 1, f(2) = 3, f(3) = 2. As a
result, the ways in which we can move this triangle are described by the two elements {e, f},
which form a subgroup of S_3. Observe that the collection of ways in which we can move and
reposition the triangle is not merely a set. It is a group because we can always follow one
motion by another motion, and following one motion by another is really the same as
composing functions in S_3.


For our next example, suppose an equilateral triangle is cut out of a piece of wood. Once 
again, we label the vertices and each corner of the triangle. 


As before, after each movement of the triangle, we can record where each corner of the triangle
ends up. In this way, we again see that the motions of this triangle form a subgroup of S_3. The
question, at this point, is how many elements of S_3 are obtained as motions of this triangle?

Clearly, if we return the triangle to its original position, this motion corresponds to e ∈ S_3,
where e(1) = 1, e(2) = 2, e(3) = 3. Next, consider the point at the center of the triangle. When
we rotate the triangle 120° clockwise around this point, corner #1 ends up at vertex #2, corner
#2 ends up at vertex #3, and corner #3 ends up at vertex #1. Therefore, this motion
corresponds to j ∈ S_3, where j(1) = 2, j(2) = 3, j(3) = 1.

If we instead rotate the triangle 240° clockwise around this point, we obtain the motion
corresponding to k ∈ S_3, where k(1) = 3, k(2) = 1, k(3) = 2.

Next, consider the line starting at vertex #1 and passing through the center. If we flip the
triangle around this line, thereby switching the parts of the triangle that face up and down,
we obtain the motion corresponding to f ∈ S_3, where f(1) = 1, f(2) = 3, f(3) = 2.

At this point, we have obtained four of the six elements of S_3. We can obtain the final two
motions by combining or composing some of the previous motions. For example, to obtain
the motion corresponding to h ∈ S_3, where h(1) = 2, h(2) = 1, h(3) = 3, we first apply the
motion corresponding to f, then the motion corresponding to j.

Finally, to obtain g ∈ S_3, where g(1) = 3, g(2) = 2, g(3) = 1, first apply the motion
corresponding to f and then the one corresponding to k. As a result, we can see that the
motions of an equilateral triangle form the group S_3.
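The two compositions described above can be checked directly. In this small sketch, which is our own illustration, each motion is stored as a dictionary sending a corner number to the vertex where that corner ends up, and the helper `then` (a hypothetical name) applies one motion after another.

```python
# Each motion is recorded as a dict sending corner number -> vertex number.
e = {1: 1, 2: 2, 3: 3}
f = {1: 1, 2: 3, 3: 2}   # flip about the line through vertex #1
j = {1: 2, 2: 3, 3: 1}   # 120 degree clockwise rotation
k = {1: 3, 2: 1, 3: 2}   # 240 degree clockwise rotation

def then(first, second):
    """Motion obtained by applying `first` and afterwards `second`."""
    return {corner: second[first[corner]] for corner in first}

h = then(f, j)   # first f, then j
g = then(f, k)   # first f, then k

motions = [e, f, g, h, j, k]
distinct = {tuple(sorted(m.items())) for m in motions}

print(h)
print(g)
print(len(distinct))
```

The six motions are pairwise different, so together they account for all six elements of S_3.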


A regular n-gon is an n-sided object in the plane where all the angles and all the sides are
equal. Thus, an equilateral triangle is a regular 3-gon, and a square is a regular 4-gon.
We would now like to extend our analysis to describe the motions of all regular n-gons,
for n ≥ 3.

Given a regular n-gon, we begin by labeling each vertex and each corner of the n-gon in a
clockwise fashion. By examining where a motion takes each corner, every motion of the
n-gon corresponds to an element of S_n. Since any motion can be followed by another motion,
the set of motions of an n-gon is closed under composition, so by Proposition 8.6(b), it is a
subgroup of S_n. Our goal is to determine which elements of S_n belong to this subgroup.

The types of motions we are discussing are commonly referred to as rigid motions or 
symmetries. 


Definition 8.10. If n ≥ 3, the nth dihedral group is denoted as D_n and consists of all rigid
motions or symmetries of a regular n-gon.

In the next six diagrams, we will be looking at the rigid motions of a regular 12-gon. As we try
to understand the situation for any n ≥ 3, the number 12 should be large enough that these
diagrams successfully illustrate the general situation.

Given any rigid motion, we first examine what happens to corner #1. It can end up at any of n
different vertices. Observe that after applying this motion, corner #1 must still lie between
corners #2 and #n, so corner #2 lands on one of the two vertices next to corner #1, and this
choice determines where every other corner goes. Therefore, as we can see in the following
diagrams, there are at most 2n rigid motions of a regular n-gon.




Now we will see that all 2n possibilities are indeed rigid motions of a regular n-gon. Consider
the point in the center and look at the line starting at vertex #1 that passes through the center.
Now, let x denote flipping the n-gon around this line, which switches which part of the
n-gon faces up and which faces down. Note that after applying x, corner #1 remains at vertex
#1, but when we proceed clockwise around the n-gon, corner #n comes after corner #1 and
not before.

Next, let y denote rotating the entire n-gon (360/n)° in a clockwise direction around the point in
the center. Observe that this moves corner #1 to vertex #2 and keeps corner #2 after corner #1
when moving in a clockwise direction.


Since x and y belong to the group of motions of a regular n-gon, so does any series of repeated
applications of x and y. Observe that if we apply y to the n-gon i times, with 0 ≤ i ≤ n - 1,
then we obtain the motion that moves corner #1 to vertex #(i + 1), with corner #2 coming
immediately after corner #1.

On the other hand, if we first apply x and then follow it by applying y i times, where
0 ≤ i ≤ n - 1, then this motion moves corner #1 to vertex #(i + 1) while corner #2 now comes
before corner #1.


As a result, there are indeed 2n rigid motions of the regular n-gon and they are all of the form
y^i x^j, where 0 ≤ i ≤ n - 1 and 0 ≤ j ≤ 1. Also observe that x and y do not commute, as the motion
xy is not the same as the motion yx but is instead equal to y^{n-1}x. To summarize, we have shown

Proposition 8.11. For every n ≥ 3, the nth dihedral group is a nonabelian group with
2n elements. Every element of D_n can be represented uniquely in the form y^i x^j, where
0 ≤ i ≤ n - 1, 0 ≤ j ≤ 1, y^n = x^2 = e, and xy = y^{n-1}x.
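The statements collected in Proposition 8.11 can be tested numerically for small n. The sketch below is only an illustration under stated assumptions: vertices are numbered 0, ..., n - 1 clockwise instead of 1, ..., n, each motion is stored as a tuple giving the image of every vertex, and a product such as xy means "apply y first, then x", as in composition of functions. The names `compose`, `power`, and `dihedral` are ours.

```python
def compose(p, q):
    """Return the permutation 'q first, then p' (ordinary function composition)."""
    return tuple(p[q[v]] for v in range(len(q)))

def power(p, k):
    """Apply the permutation p a total of k times."""
    result = tuple(range(len(p)))          # identity permutation
    for _ in range(k):
        result = compose(p, result)
    return result

def dihedral(n):
    """Return (elements, x, y) for the motions of a regular n-gon.

    y rotates every vertex one step clockwise, x flips the n-gon about
    the line through vertex 0 and the center.
    """
    y = tuple((v + 1) % n for v in range(n))
    x = tuple((-v) % n for v in range(n))
    elements = {compose(power(y, i), power(x, j))
                for i in range(n) for j in range(2)}
    return elements, x, y

for n in range(3, 13):
    elements, x, y = dihedral(n)
    e = tuple(range(n))
    assert len(elements) == 2 * n                          # 2n distinct motions y^i x^j
    assert power(y, n) == e and power(x, 2) == e           # y^n = x^2 = e
    assert compose(x, y) == compose(power(y, n - 1), x)    # xy = y^(n-1) x
    assert compose(x, y) != compose(y, x)                  # x and y do not commute
```

The loop raises no assertion error for n = 3, ..., 12, which is consistent with the proposition but is of course not a proof for all n.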


We now address the issue of what it means for two groups to be the “same”. To address this issue,
we begin by looking at the tables for the groups Z_4 and Z_5^*. The table for Z_4 is

  ∘      [0]_4  [1]_4  [2]_4  [3]_4
[0]_4    [0]_4  [1]_4  [2]_4  [3]_4
[1]_4    [1]_4  [2]_4  [3]_4  [0]_4
[2]_4    [2]_4  [3]_4  [0]_4  [1]_4
[3]_4    [3]_4  [0]_4  [1]_4  [2]_4

and the table for Z_5^* is

  ∘      [1]_5  [2]_5  [3]_5  [4]_5
[1]_5    [1]_5  [2]_5  [3]_5  [4]_5
[2]_5    [2]_5  [4]_5  [1]_5  [3]_5
[3]_5    [3]_5  [1]_5  [4]_5  [2]_5
[4]_5    [4]_5  [3]_5  [2]_5  [1]_5


Admittedly, the next step will appear to be unmotivated, but it is quite necessary. In the table
for Z_5^*, we will swap the positions of [3]_5 and [4]_5 in both the column under ∘ and the row to
the right of ∘. This gives us a slightly different looking table for Z_5^*:

  ∘      [1]_5  [2]_5  [4]_5  [3]_5
[1]_5    [1]_5  [2]_5  [4]_5  [3]_5
[2]_5    [2]_5  [4]_5  [3]_5  [1]_5
[4]_5    [4]_5  [3]_5  [1]_5  [2]_5
[3]_5    [3]_5  [1]_5  [2]_5  [4]_5
Next, let

φ : Z_4 → Z_5^*

be defined as

φ([0]_4) = [1]_5, φ([1]_4) = [2]_5, φ([2]_4) = [4]_5, φ([3]_4) = [3]_5.


In the table for Z_5^*, we will now represent each element of Z_5^* as the image of an element of
Z_4 under φ to obtain

    ∘        φ([0]_4)  φ([1]_4)  φ([2]_4)  φ([3]_4)
φ([0]_4)     φ([0]_4)  φ([1]_4)  φ([2]_4)  φ([3]_4)
φ([1]_4)     φ([1]_4)  φ([2]_4)  φ([3]_4)  φ([0]_4)
φ([2]_4)     φ([2]_4)  φ([3]_4)  φ([0]_4)  φ([1]_4)
φ([3]_4)     φ([3]_4)  φ([0]_4)  φ([1]_4)  φ([2]_4)


Observe that the tables for Z_4 and Z_5^* are identical except that every g in the table for
Z_4 is replaced by φ(g) in the table for Z_5^*. Not only does φ provide us with a one-to-one
correspondence between the elements of Z_4 and Z_5^*, it also makes the tables for these groups
look essentially the same.

Observe that if g, h ∈ Z_4, then the product in Z_5^* of φ(g) and φ(h) is φ(g)φ(h). Even though
Z_4 is a group under addition, we will use multiplication notation. Therefore, when we look at
the table for Z_5^*, the element corresponding to the product of φ(g) and φ(h) is φ(gh). Thus,

φ(gh) = φ(g)φ(h)

and this equation is the key!

What this means is that Z_5^* is essentially a carbon copy of Z_4, except that the elements in Z_5^*
have slightly different names. In Z_4, we may call elements g and h, whereas in Z_5^*, we may
call them φ(g) and φ(h). But the equation φ(gh) = φ(g)φ(h) tells us that g and h add in Z_4
in exactly the same way that φ(g) and φ(h) multiply in Z_5^*. Thus, not only do Z_4 and Z_5^* have
the same number of elements, but each g ∈ Z_4 behaves the same way in Z_4 as φ(g) does in
Z_5^*. We generalize this as


Definition 8.12. Let G_1, G_2 be groups; G_1 and G_2 are said to be isomorphic if there exists a
bijection φ : G_1 → G_2 such that

φ(gh) = φ(g)φ(h),

for all g, h ∈ G_1.

If φ is as in Definition 8.12, we say that φ is an isomorphism of groups. If two groups are
isomorphic, then they are essentially the same object, except that the names of the elements
in the two groups might be somewhat different. In particular, the table for G_2 under ∘ is
identical to the table for G_1 except that each g in the table for G_1 is replaced by φ(g).
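For the map φ : Z_4 → Z_5^* used above, the condition in Definition 8.12 involves only sixteen pairs (g, h), so it can be checked exhaustively. The following sketch is an illustration of ours; residues are written as plain integers and `is_isomorphism` is a hypothetical helper name.

```python
# The map from the text: [0]_4 -> [1]_5, [1]_4 -> [2]_5, [2]_4 -> [4]_5, [3]_4 -> [3]_5.
phi = {0: 1, 1: 2, 2: 4, 3: 3}

def add_mod_4(g, h):
    return (g + h) % 4

def mul_mod_5(g, h):
    return (g * h) % 5

def is_isomorphism(mapping, op_source, op_target, target):
    """Check that `mapping` is a bijection onto `target` respecting the operations."""
    bijective = sorted(mapping.values()) == sorted(target)
    respects = all(mapping[op_source(g, h)] == op_target(mapping[g], mapping[h])
                   for g in mapping for h in mapping)
    return bijective and respects

print(is_isomorphism(phi, add_mod_4, mul_mod_5, [1, 2, 3, 4]))

# A bijection that ignores the group structure fails the test:
naive = {0: 1, 1: 2, 2: 3, 3: 4}
print(is_isomorphism(naive, add_mod_4, mul_mod_5, [1, 2, 3, 4]))
```

The second map is a perfectly good one-to-one correspondence, yet it sends [1]_4 + [1]_4 = [2]_4 to [3]_5 while [2]_5 · [2]_5 = [4]_5. This is why the swap of [3]_5 and [4]_5 in the table was needed.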


Not all groups of the same order are isomorphic. The groups Z_4, Z_5^*, and U(Z_8) all have
order 4, and we have already shown that Z_4 and Z_5^* are isomorphic. However, these groups
are not isomorphic to U(Z_8). To see this, suppose we are in the situation where φ : G_1 → G_2 is
an isomorphism of finite groups and G_1 is cyclic. If g ∈ G_1 is such that G_1 = < g >, then we
would expect that φ(g) plays the same role in G_2 that g plays in G_1. Therefore, not only should
G_2 be cyclic, but it should also be the case that G_2 = < φ(g) >. Indeed, if y ∈ G_2, then there
exists x ∈ G_1 such that y = φ(x). Since G_1 is cyclic, there exists n ∈ N such that x = g^n. Using
arguments similar to the ones in Chapter 5 in our discussion of complex conjugation and
automorphisms, you should convince yourself that φ(g^n) = (φ(g))^n. Therefore, we now have

y = φ(x) = φ(g^n) = (φ(g))^n.

Thus, every y ∈ G_2 does indeed belong to < φ(g) >, so G_2 is cyclic.
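The identity φ(g^n) = (φ(g))^n used in this argument follows from Definition 8.12 by induction on n. One possible way to write out the two steps, offered here only as a sketch of the verification the reader is asked to carry out, is:

```latex
\begin{align*}
\phi(g^{1}) &= \phi(g) = (\phi(g))^{1}, \\
\phi(g^{n+1}) &= \phi(g^{n} g) = \phi(g^{n})\,\phi(g)
              = (\phi(g))^{n}\,\phi(g) = (\phi(g))^{n+1}.
\end{align*}
```

The second line uses the defining property φ(gh) = φ(g)φ(h) once, followed by the induction hypothesis.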


Since Z_4 and Z_5^* are cyclic, any group to which they are isomorphic must also be cyclic.
However, U(Z_8) is not cyclic, so it is not isomorphic to Z_4 or Z_5^*.


You should also convince yourself that if two groups are isomorphic and one of them is
abelian, then so is the other one. Therefore, even though the groups Z_6 and S_3 both have six
elements, they are not isomorphic, as Z_6 is abelian and S_3 is not.


Earlier in this section, we remarked that the groups H_n = {T_{m,b} | m = [1]_n or [-1]_n, b ∈ Z_n}
also arise in a geometric context and are called D_n. In the language of isomorphisms, we are
asserting that for n ≥ 3, H_n is isomorphic to D_n. You will be asked to verify this in the
exercises.


Suppose we define ~ on the set of groups by saying G_1 ~ G_2 whenever G_1 and G_2 are
isomorphic. In the exercises, you will be asked to show that ~ is an equivalence relation on
the set of groups. Among the things that researchers in the theory of finite groups try to do is to
describe the structure of all finite groups. Having the concept of isomorphism at our disposal,
we can be more formal about what this actually means. For every n ≥ 1, mathematicians
would like to determine the number of equivalence classes of groups of order n. Then, they
would like to concretely describe one group in each equivalence class. Later in this chapter we
shall see that, for certain values of n, this is fairly easy to do. But for some values of n,
especially those with many different prime factors, this becomes a very, very difficult
problem.


Exercises for Section 8.1 


In exercises 1-26, determine if the set S is a group under ∘. Although ∘ often represents a
binary operation, in these examples we are not assuming that S is closed under ∘. If S is not a
group, briefly explain why it is not a group.

1. S is the complex numbers of length 1 and ∘ is multiplication.
2. S is the complex numbers of length 1 and ∘ is addition.
3. S is the complex numbers of length 1 and ∘ is division.
4. S is the complex numbers whose length is a positive integer and ∘ is multiplication.
5. S is the complex numbers whose length is a positive rational number and ∘ is
multiplication.
6. S is the complex numbers whose length is a positive rational number and ∘ is addition.
7. S = {a + bi | a ∈ Q, b ∈ R} and ∘ is addition.
8. S = {a + bi | a ∈ Q, a ≠ 0, b ∈ R} and ∘ is multiplication.
9. S = Z and ∘ is subtraction.
10. S is the set of nonzero rational numbers and ∘ is division.
11. S is the set of positive rational numbers and ∘ is addition.
12. S is the set of positive rational numbers and ∘ is multiplication.
13. S = {a + bx + cx^2 | a, b, c ∈ Z} and ∘ is addition.
14. S = {ax + bx* | a, b ∈ Q} and ∘ is addition.
15. S = {f(x) ∈ R[x] | f(0) = 0} and ∘ is addition.
16. S = {f(x) ∈ R[x] | f(0) ≠ 0} and ∘ is addition.
17. S is the set of rational numbers of the form a/b with a, b ∈ Z and b odd where ∘ is addition.
18. S is the set of rational numbers of the form a/b with a, b ∈ N and b odd where ∘ is addition.
19. S is the set of rational numbers of the form a/b with a, b ∈ N and b odd where ∘ is
multiplication.
20. S is the set of rational numbers of the form a/b with a, b positive odd integers and ∘ is
multiplication.
21. S consists of those elements in S_4 which send 1 to 1 and ∘ is composition of functions.
22. S consists of those elements in S_4 which do not send 1 to 1 and ∘ is composition of
functions.
23. S consists of those elements in S_3 of order 1 or 2 and ∘ is composition of functions.
24. If G is an abelian group, let S be the elements in G of order 1 or 2 and ∘ will be the
binary operation in G.
25. S consists of those elements in S_3 of order 1 or 3 and ∘ is composition of functions.
26. If G is an abelian group, let S be the elements in G of order 1 or 3 and ∘ will be the
binary operation in G.




Exercises 27-32 will be based on the following multiplication table for D_3, the third dihedral
group. Recall that every element of D_3 can be written uniquely in the form y^i x^j, where
0 ≤ i ≤ 2, 0 ≤ j ≤ 1, and y^3 = x^2 = e. In constructing the table, remember that the term xy can
and should be replaced by y^2x.

  ∘      e      x      y      yx     y^2    y^2x
  e      e      x      y      yx     y^2    y^2x
  x      x      e      y^2x   y^2    yx     y
  y      y      yx     y^2    y^2x   e      x
  yx     yx     y      x      e      y^2x   y^2
  y^2    y^2    y^2x   e      x      y      yx
  y^2x   y^2x   y^2    yx     y      x      e

27. Write down the elements in the cyclic subgroup generated by x and determine the order
of x.
28. Write down the elements in the cyclic subgroup generated by y and determine the order
of y.
29. Write down the elements in the cyclic subgroup generated by yx and determine the order
of yx.
30. Write down the elements in the cyclic subgroup generated by y^2 and determine the order
of y^2.
31. Write down the elements in the cyclic subgroup generated by y^2x and determine the
order of y^2x.
32. Show that the only subgroup of D_3 which contains both y^2x and y is D_3.


Exercises 33-43 will be based on the following partially completed multiplication table for
D_4, the fourth dihedral group. Recall that every element of D_4 can be written uniquely in the
form y^i x^j, where 0 ≤ i ≤ 3, 0 ≤ j ≤ 1, and y^4 = x^2 = e. In constructing the table, remember
that the term xy can and should be replaced by y^3x.

  ∘      e      x      y      yx     y^2    y^2x   y^3    y^3x
  e      e      x      y      yx     y^2    y^2x   y^3    y^3x
  x      x
  y      y
  yx     yx
  y^2    y^2
  y^2x   y^2x
  y^3    y^3
  y^3x   y^3x

33. Complete the two columns on the table that have x on top and y on top.
34. Complete the two columns on the table that have yx on top and y^2 on top.
35. Complete the two columns on the table that have y^2x on top and y^3 on top.
36. Complete the column on the table that has y^3x on top and observe that you are placing in
each row the only element of D_4 that was missing.
37. Write down the elements in the cyclic subgroup generated by x and determine the order
of x.
38. Write down the elements in the cyclic subgroup generated by y and determine the order
of y.
39. Write down the elements in the cyclic subgroup generated by yx and determine the order
of yx.
40. Write down the elements in the cyclic subgroup generated by y^2 and determine the order
of y^2.
41. Write down the elements in the cyclic subgroup generated by y^2x and determine the
order of y^2x.
42. Write down the elements in the cyclic subgroup generated by y^3 and determine the order
of y^3.
43. Write down the elements in the cyclic subgroup generated by y^3x and determine the
order of y^3x.


In exercises 44-52, let f, g, h ∈ S_4 be described in terms of their behavior on the set {1, 2, 3, 4}
as follows:

f(1) = 1, f(2) = 4, f(3) = 3, f(4) = 2,
g(1) = 2, g(2) = 3, g(3) = 1, g(4) = 4,
h(1) = 4, h(2) = 3, h(3) = 2, h(4) = 1.

44. Compute the order of f.
45. Compute the order of g.
46. Compute the order of h.
47. Express f ∘ g in terms of its behavior on {1, 2, 3, 4} and then find the order of f ∘ g.
48. Express g ∘ f in terms of its behavior on {1, 2, 3, 4} and then find the order of g ∘ f.
49. Express f ∘ h in terms of its behavior on {1, 2, 3, 4} and then find the order of f ∘ h.
50. Express h ∘ f in terms of its behavior on {1, 2, 3, 4} and then find the order of h ∘ f.
51. Express g ∘ h in terms of its behavior on {1, 2, 3, 4} and then find the order of g ∘ h.
52. Express h ∘ g in terms of its behavior on {1, 2, 3, 4} and then find the order of h ∘ g.
53. Let G be a group with subgroups H, K. Suppose h ∈ H and k ∈ K such that h ∉ K and
k ∉ H.
(a) Prove that H ∪ K does not contain hk.
(b) Prove that H ∪ K is not a subgroup of G.
54. Use exercise 53 to show that if G is a group, then G is not the union of two subgroups
that are not all of G.
55. Show that the group U(Z_8) is the union of three subgroups, none of which is all of U(Z_8).
56. If H_1, H_2, ..., H_m are subgroups of a group G, show that the intersection
H_1 ∩ H_2 ∩ ··· ∩ H_m is also a subgroup of G.
57. If G is a group and a ∈ G, let π : G → G be the function defined as π(g) = ag, for all
g ∈ G.
(a) Show that π is a bijection.
(b) Show that if π is an isomorphism, then a is the identity element of G.
58. If G is a group and b ∈ G, let ρ : G → G be the function defined as ρ(g) = gb, for all
g ∈ G.
(a) Show that ρ is a bijection.
(b) Show that if ρ is an isomorphism, then b is the identity element of G.
59. If G is a group and a, b ∈ G, let τ : G → G be the function defined as τ(g) = agb, for all
g ∈ G.
(a) Show that τ is a bijection.
(b) Show that if τ is an isomorphism, then a = b^{-1}.
(c) Show that if a = b^{-1}, then τ is an isomorphism.
60. Explain how exercises 57 and 58 are special cases of exercise 59.
61. Let φ : R^+ → R^+ be defined as φ(x) = x^2, for all x ∈ R^+, where R^+ is the set of positive
real numbers.
(a) Explain why R^+ is a group under multiplication.
(b) Is φ an isomorphism of groups? Explain your answer.
62. Let ρ : R^* → R^* be defined as ρ(x) = x^2, for all x ∈ R^*, where R^* is the set of nonzero
real numbers. Is ρ an isomorphism of groups? Explain your answer.


63. Let τ : R^* → R^* be defined as τ(x) = x, for all x ∈ R^*, where R^* is the set of nonzero
real numbers. Is τ an isomorphism of groups? Explain your answer.
64. Let ν : Q^+ → Q^+ be defined as ν(x) = x^2, for all x ∈ Q^+, where Q^+ is the set of positive
rational numbers.
(a) Explain why Q^+ is a group under multiplication.
(b) Is ν an isomorphism of groups? Explain your answer.
65. Let G be a finite abelian group with no elements of order 2. Show that the function
φ : G → G defined as φ(g) = g^2, for all g ∈ G, is an isomorphism.
66. Suppose G is a finite abelian group and p a prime number such that G has no elements of
order p. Show that the function τ : G → G defined as τ(g) = g^p, for all g ∈ G, is an
isomorphism.
67. Let φ : G_1 → G_2 be an isomorphism of groups. Show that G_1 is abelian if and only if G_2
is abelian.
68. Let φ : G_1 → G_2 be an isomorphism of groups.
(a) If e_1 is the identity element of G_1 and e_2 is the identity element of G_2, show that
φ(e_1) = e_2.
(b) If n ∈ N and g ∈ G_1, show that g has order n in G_1 if and only if φ(g) has order n
in G_2.
69. Define ~ on the set of groups as G_1 ~ G_2 precisely when G_1 and G_2 are isomorphic.
Show that ~ is an equivalence relation.
70. In this exercise, you will verify that for n ≥ 3, the group
H_n = {T_{m,b} | m = [1]_n or [-1]_n, b ∈ Z_n} is isomorphic to the nth dihedral group D_n.
(a) In D_n, show that if m ∈ N then xy^m x = y^{-m}. When doing this, you may want to
look at part (c) of exercise 59.
(b) In D_n, show that if 0 ≤ i, k ≤ n - 1 and 0 ≤ j, l ≤ 1, then
(y^i x^j)(y^k x^l) = y^{i+(-1)^j k} x^{j+l}. It might help to consider the cases j = 0 and j = 1 separately.
(c) In H_n, show that if 0 ≤ i, k ≤ n - 1 and 0 ≤ j, l ≤ 1, then
T_{[-1]_n^j, [i]_n} ∘ T_{[-1]_n^l, [k]_n} = T_{[-1]_n^{j+l}, [i+(-1)^j k]_n}.
(d) Show that the function φ : D_n → H_n, defined as
φ(y^i x^j) = T_{[-1]_n^j, [i]_n}
whenever 0 ≤ i ≤ n - 1 and 0 ≤ j ≤ 1, is an isomorphism.




8.2 Theorems of Lagrange and Sylow 


One way to try to understand the structure of a finite group G is to look at its subgroups. 
Cyclic subgroups were introduced in the previous section because they arise in a very natural 
way. We now look at another collection of subgroups that arise very naturally. 


Definition 8.13. If G is a group and g ∈ G, let C_G(g) = {h ∈ G | gh = hg}; we call this
set the centralizer of g in G.

Observe that C_G(g) consists of the elements of G that commute with g. In particular, if G is
abelian, then C_G(g) = G, for all g ∈ G. To look at a nonabelian example, we turn our attention
to S_3. When no ambiguity arises, we will often write C(g) instead of C_G(g) to denote the
centralizer.


■ Example

In Section 8.1, we looked at the table for S_3. If you take a look back at this table, you will
see that

C(e) = S_3, C(f) = {e, f}, C(g) = {e, g},
C(h) = {e, h}, C(j) = {e, j, k}, C(k) = {e, j, k}.
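These six centralizers can also be found by brute force. The sketch below is our own illustration, not part of the original text: a permutation p of {1, 2, 3} is stored as the tuple (p(1), p(2), p(3)), and the labels e, f, g, h, j, k follow the descriptions given in Section 8.1.

```python
from itertools import permutations

# A permutation of {1, 2, 3} is stored as a tuple p with p[i-1] = p(i).
S3 = list(permutations((1, 2, 3)))
names = {(1, 2, 3): "e", (1, 3, 2): "f", (3, 2, 1): "g",
         (2, 1, 3): "h", (2, 3, 1): "j", (3, 1, 2): "k"}

def compose(p, q):
    """Function composition: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def centralizer(g, group):
    """All elements of `group` that commute with g."""
    return [h for h in group if compose(g, h) == compose(h, g)]

for g in S3:
    print(f"C({names[g]}) =", sorted(names[h] for h in centralizer(g, S3)))
```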


Notice that the centralizer of each element of S_3 is a subgroup of S_3. We will now prove that
this is merely a special case of the fact that in any group the centralizer of an element is always
a subgroup. When reading the proof, keep in mind that the only properties we are allowed to
use when doing computations with the elements of a group are the associative law and
properties of the identity and inverses. At various points in the proof, you should
convince yourself that each line follows from the previous one using these properties and no others.

Proposition 8.14. If G is a group and g ∈ G, then C(g) is a subgroup of G.


Proof. Let x, y ∈ C(g); by Proposition 8.6(a) we need to show that xy^{-1} ∈ C(g). We know that
gx = xg and gy = yg. Multiplying this last equation on the right and left by y^{-1}, we obtain

y^{-1}(gy)y^{-1} = y^{-1}(yg)y^{-1}.

Using the associative law and properties of the identity and inverses, this simplifies to

y^{-1}g = gy^{-1}.

The previous equations combined with the repeated use of associativity tell us that

(xy^{-1})g = x(y^{-1}g) = x(gy^{-1}) = (xg)y^{-1} = (gx)y^{-1} = g(xy^{-1}).

Thus, xy^{-1} ∈ C(g), as desired.


Observe that, for any group G and g ∈ G, it will always be the case that

< g > ⊆ C_G(g) ⊆ G.

In our preceding examples, C_G(g) has always been equal to either < g > or G. However, we
will see some examples where this is no longer the case.


■ Example

Let D_4 be the fourth dihedral group. Recall it consists of y^i x^j, where 0 ≤ i ≤ 3, 0 ≤ j ≤ 1,
x^2 = y^4 = e, and xy = y^3x. The table for this eight element group is

  ∘      e      x      y      yx     y^2    y^2x   y^3    y^3x
  e      e      x      y      yx     y^2    y^2x   y^3    y^3x
  x      x      e      y^3x   y^3    y^2x   y^2    yx     y
  y      y      yx     y^2    y^2x   y^3    y^3x   e      x
  yx     yx     y      x      e      y^3x   y^3    y^2x   y^2
  y^2    y^2    y^2x   y^3    y^3x   e      x      y      yx
  y^2x   y^2x   y^2    yx     y      x      e      y^3x   y^3
  y^3    y^3    y^3x   e      x      y      yx     y^2    y^2x
  y^3x   y^3x   y^3    y^2x   y^2    yx     y      x      e

You should now do the computations needed to confirm that

C(e) = C(y^2) = D_4,                  C(x) = C(y^2x) = {e, x, y^2, y^2x},
C(y) = C(y^3) = {e, y, y^2, y^3},     C(yx) = C(y^3x) = {e, yx, y^2, y^3x}.
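The centralizers in D_4 can be confirmed with a short computation. This is a sketch under our own encoding, not the author's method: the element y^i x^j is stored as the pair (i, j), and products are computed with the rule (y^i x^j)(y^k x^l) = y^{i+(-1)^j k} x^{j+l} from exercise 70(b) of Section 8.1. The names `mul`, `name`, and `centralizer` are hypothetical helpers.

```python
N = 4  # work in D_4; the element y^i x^j is stored as the pair (i, j)

def mul(a, b, n=N):
    """(y^i x^j)(y^k x^l) = y^(i + (-1)^j k) x^(j + l), a consequence of xy = y^(n-1) x."""
    (i, j), (k, l) = a, b
    return ((i + (-1) ** j * k) % n, (j + l) % 2)

def name(a):
    """Readable label such as 'e', 'y^2', or 'y^3x' for the pair (i, j)."""
    i, j = a
    word = ("" if i == 0 else "y" if i == 1 else f"y^{i}") + ("x" if j else "")
    return word or "e"

D4 = [(i, j) for i in range(N) for j in range(2)]

def centralizer(g, group=D4):
    """All elements of the group that commute with g."""
    return [h for h in group if mul(g, h) == mul(h, g)]

for g in D4:
    print(f"C({name(g)}) =", [name(h) for h in centralizer(g)])
```

Note that C(x) has four elements, so it is strictly larger than the cyclic subgroup generated by x and strictly smaller than D_4.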


Centralizers tell us which elements of a group commute with a particular element of the group. 
In examining the structure of a group, we are often interested in those elements that commute 
with every element of the group. 


Definition 8.15. If G is a group, let Z(G) = {g ∈ G | gh = hg, for all h ∈ G}. We call Z(G)
the center of G.


Observe that G is abelian if and only if G = Z(G). In some sense, the larger Z(G) is, the 
closer G is to being abelian. It should come as little surprise that Z(G) is also a subgroup of G. 


Proposition 8.16. If G is a group, then Z(G) is a subgroup of G. 


Proof. This proof will be very similar to the proof of Proposition 8.14. Let x, y ∈ Z(G); by
Proposition 8.6(a) we need to show that xy^{-1} ∈ Z(G). Therefore, if h ∈ G, we need to show
that xy^{-1} commutes with h.

We know that hx = xh and hy = yh. Multiplying this last equation on the right and left by y^{-1},
we obtain

y^{-1}(hy)y^{-1} = y^{-1}(yh)y^{-1}.

Using the associative law and properties of the identity and inverses, this simplifies to

y^{-1}h = hy^{-1}.

The previous equations combined with the repeated use of associativity tell us that

(xy^{-1})h = x(y^{-1}h) = x(hy^{-1}) = (xh)y^{-1} = (hx)y^{-1} = h(xy^{-1}).

Thus, xy^{-1} commutes with every h ∈ G and xy^{-1} ∈ Z(G), as desired.


Another way to look at Z(G) is as the intersection of every C(g) as g ranges through the
elements of G. Using this observation and the tables for S_3 and D_4, it is not hard to see that
Z(S_3) = {e} and Z(D_4) = {e, y^2}.
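The description of the center as an intersection of centralizers translates directly into a computation. In the following sketch, which is only an illustration, S_3 is treated as D_3 (Section 8.1 showed that the motions of an equilateral triangle give all of S_3), the element y^i x^j is stored as the pair (i, j), and the function names are our own.

```python
def dihedral(n):
    """Elements y^i x^j of D_n as pairs (i, j) with 0 <= i < n and j in {0, 1}."""
    return [(i, j) for i in range(n) for j in range(2)]

def mul(a, b, n):
    """(y^i x^j)(y^k x^l) = y^(i + (-1)^j k) x^(j + l)."""
    (i, j), (k, l) = a, b
    return ((i + (-1) ** j * k) % n, (j + l) % 2)

def centralizer(g, n):
    return {h for h in dihedral(n) if mul(g, h, n) == mul(h, g, n)}

def center(n):
    """Intersect the centralizers C(g) over every g in D_n."""
    result = set(dihedral(n))
    for g in dihedral(n):
        result &= centralizer(g, n)
    return result

print(sorted(center(3)))   # pairs (i, j) lying in the center of D_3
print(sorted(center(4)))   # pairs (i, j) lying in the center of D_4
```

The pair (0, 0) stands for e and (2, 0) stands for y^2, so the output can be compared with the two centers stated above.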


Let us now reflect on some of our examples. Keep in mind that every group G with more 
than one element always has at least two different subgroups, as < e > and G are always 
subgroups. 


■ Examples

1. S_3 has order 6, and we have already seen that S_3 has subgroups of order 1, 2, 3, and 6.

2. D_4 has order 8. When we look at the subgroups < e > and Z(D_4) = {e, y^2} along
with the preceding list of centralizers, we can see that D_4 has subgroups of order
1, 2, 4, and 8.


In these examples, the order of every subgroup we found divides the order of the group. 
Furthermore, for every divisor of the order of the group, we can find a subgroup of that order. 
Although two examples may not appear to be an enormous amount of evidence, it does seem 
quite reasonable to ask the following two questions about the subgroups of a finite group G: 


Questions:
1. If H is a subgroup of G, must |H| divide |G|?
2. If n ∈ N and n divides |G|, must G have a subgroup of order n?


We will now begin to try to answer the first question. To answer this and other questions, 
we need 


Lemma 8.17. Let G be a group where M ⊆ G and a ∈ G. Next, let Ma = {xa | x ∈ M} and
let aM = {ax | x ∈ M}.

(a) If M is finite, then the sets M, Ma, aM all have the same number of elements.

(b) If H is a subgroup and we define ~_r on the set G as a ~_r b whenever ab^{-1} ∈ H, then ~_r is
an equivalence relation.

(c) If H is a subgroup and we define ~_l on the set G as a ~_l b whenever a^{-1}b ∈ H, then ~_l is
an equivalence relation.


At this point, you probably do not have a great deal of experience proving facts about groups. 
Therefore, at the risk of being repetitive, we again point out that the only properties we are 
allowed to use are those dealing with associativity, the identity, and inverses. In particular, we 
cannot assume that multiplication is commutative. 


Proof. For part (a), we will first show that M and Ma have the same number of elements. It
will be enough to find a bijection from M to Ma. Since Ma is defined as {ma | m ∈ M}, it
seems natural to consider the function

π : M → Ma,

where

π(m) = ma,

for all m ∈ M.

It is easy to see that π is surjective. Indeed, if y ∈ Ma then, by the definition of Ma, y = xa, for
some x ∈ M. Thus,

y = xa = π(x)

and y belongs to the range of π. Hence, π is surjective.

To prove that π is injective, suppose g, h ∈ M such that π(g) = π(h); we need to show that
g = h. Observe that since π(g) = π(h), we have

ga = ha.

Multiplying this equation on the right by a^{-1} yields

(ga)a^{-1} = (ha)a^{-1},

then

g(aa^{-1}) = h(aa^{-1}),

followed by

ge = he,

and finally

g = h.

Since π is both injective and surjective, it is a bijection, so M and Ma have the same number
of elements. Not surprisingly, to show that M and aM have the same number of elements,
we consider the function

ρ : M → aM,

where

ρ(m) = am,

for all m ∈ M. It will be good practice for you to work through the details and verify that ρ is
also a bijection.
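Part (a) can be illustrated on a small example. The sketch below is our own and makes several arbitrary choices: the group is S_3 acting on {0, 1, 2}, permutations are stored as tuples, and M is a three-element subset that is deliberately not a subgroup, since the lemma does not require one.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def right_translate(M, a):
    """The set Ma = {xa | x in M}."""
    return {compose(x, a) for x in M}

def left_translate(M, a):
    """The set aM = {ax | x in M}."""
    return {compose(a, x) for x in M}

G = list(permutations(range(3)))          # S_3 acting on {0, 1, 2}
M = {(0, 1, 2), (1, 0, 2), (1, 2, 0)}     # an arbitrary subset, not a subgroup

for a in G:
    assert len(right_translate(M, a)) == len(M) == len(left_translate(M, a))
```

Because the translates are built as Python sets, any collision xa = x'a with x ≠ x' would shrink the set, so equal sizes reflect exactly the injectivity argument given in the proof.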


Before proving part (b), you should observe that this is a generalization of the equivalence
relation used in Chapter 7, where we defined $\mathbb{Z}_n$ using the group $\mathbb{Z}$ and the subgroup $n\mathbb{Z}$. First,
to show that $\sim_r$ is reflexive, we need to show that $a \sim_r a$, for all $a \in G$. If $a \in G$, we have
$aa^{-1} = e$. Since $e \in H$, we see that $a \sim_r a$ and so, $\sim_r$ is reflexive.


Next, suppose $a, b \in G$ such that $a \sim_r b$. To show that $\sim_r$ is symmetric, we need to show that
$b \sim_r a$, so we may assume that $ab^{-1} \in H$, and we must verify that $ba^{-1} \in H$. Observe that

$$(ba^{-1})(ab^{-1}) = b(a^{-1}a)b^{-1} = beb^{-1} = e$$
and
$$(ab^{-1})(ba^{-1}) = a(b^{-1}b)a^{-1} = aea^{-1} = e.$$

These two equations tell us that $ba^{-1}$ is the inverse of $ab^{-1}$. However, since $H$ is a group, it
contains the inverse of $ab^{-1}$. Thus, $ba^{-1} \in H$ and $\sim_r$ is symmetric.


For the final piece of part (b), suppose $a, b, c \in G$ such that $a \sim_r b$ and $b \sim_r c$. To show that $\sim_r$
is transitive, we need to show that $a \sim_r c$. Therefore, we may assume that $ab^{-1}, bc^{-1} \in H$, and
we need to show that $ac^{-1} \in H$. Since $H$ is a group, it contains the product of $ab^{-1}$ and $bc^{-1}$.
Thus,

$$(ab^{-1})(bc^{-1}) = a(b^{-1}b)c^{-1} = aec^{-1} = ac^{-1},$$
so $ac^{-1} \in H$, as desired.


The proof of part (c) is very similar to and uses the same ideas as the proof of part (b). 
Therefore, we will leave the proof to you. Providing all the details of the proof should 
reinforce your understanding of the proof of part (b). 


In the proof of Lemma 8.17(b), we showed that the inverse of $ab^{-1}$ is $ba^{-1}$. Observe that this
argument is similar to the one in the proof of Proposition 7.7 where we showed that the inverse
of $ab$ is $b^{-1}a^{-1}$. We can now state the first significant theorem about the structure of finite
groups. Observe that it answers the first question we posed in this section in the affirmative.


Theorem 8.18 (Lagrange’s Theorem). If $G$ is a finite group and $H$ is a subgroup of $G$, then
$|H|$ divides $|G|$.


Before proving Lagrange’s Theorem, it is important to understand what it says and what it 
does not say. Note that it does not assert the existence of subgroups of G. It does quite the 
opposite, as it places limits on the orders of subgroups that can exist. 


For example, since $|S_3| = 6$, Lagrange’s Theorem tells us that every subgroup of $S_3$ must have
order 1, 2, 3, or 6. But it does not guarantee that subgroups of these orders exist. In fact, what
it really tells us is that $S_3$ does not have any subgroups of order 4 or 5.
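To see the restriction in action, one can list every subgroup of $S_3$ by brute force. This is only a sketch: the tuple encoding of permutations and the helper names `mul` and `is_subgroup` are assumptions made here, not notation from the text. For this particular group every divisor of 6 happens to occur as a subgroup order, which Lagrange’s Theorem by itself does not promise.

```python
from itertools import combinations, permutations

def mul(p, q):
    """Product pq of two permutations stored as tuples (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))   # S3 in our own encoding
identity = (0, 1, 2)

def is_subgroup(S):
    # A finite subset that contains e and is closed under products is a subgroup.
    return identity in S and all(mul(x, y) in S for x in S for y in S)

subgroup_orders = sorted(
    len(S)
    for size in range(1, len(G) + 1)
    for S in map(set, combinations(G, size))
    if is_subgroup(S)
)
print(subgroup_orders)   # [1, 2, 2, 2, 3, 6]
```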


Lagrange’s Theorem can also simplify the computations involved in finding various
subgroups, such as centralizers. Let us consider the fifth dihedral group $D_5$. Every element of
$D_5$ can be written in the form $y^i x^j$, where $0 \le i \le 4$, $0 \le j \le 1$, $y^5 = x^2 = e$, and $xy = y^{-1}x$.
Since $|D_5| = 10$, Lagrange’s Theorem tells us that the only possible subgroups of $D_5$ have
order 1, 2, 5, or 10.


Suppose we wanted to compute $C(y)$. Since $C(y)$ certainly contains $\langle y \rangle$ and $o(y) = 5$, we
know that $|C(y)| \ge 5$. Since $y$ does not commute with $x$, we also know that $|C(y)| \ne 10$.
Therefore, Lagrange’s Theorem tells us that $|C(y)| = 5$. As a result, without having to do much
computing, we see that

$$C(y) = \langle y \rangle = \{e, y, y^2, y^3, y^4\}.$$
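The same centralizer can be computed directly as a sanity check. The sketch below assumes a concrete model that the text does not fix: $D_5$ acting on the vertices 0 to 4 of a pentagon, with $y$ the rotation $i \mapsto i + 1$ and $x$ the reflection $i \mapsto -i$ (both mod 5). The helper name `mul` is also a choice made here.

```python
def mul(p, q):
    """Product pq of two permutations stored as tuples (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

n = 5
e = tuple(range(n))
y = tuple((i + 1) % n for i in range(n))   # rotation, order 5 (assumed model)
x = tuple((-i) % n for i in range(n))      # reflection, order 2 (assumed model)

# Generate D5 by repeatedly multiplying known elements by the generators.
D5 = {e}
frontier = [e]
while frontier:
    g = frontier.pop()
    for s in (y, x):
        h = mul(g, s)
        if h not in D5:
            D5.add(h)
            frontier.append(h)

centralizer = {g for g in D5 if mul(g, y) == mul(y, g)}

powers_of_y = {e}
power = e
for _ in range(n - 1):
    power = mul(power, y)
    powers_of_y.add(power)

print(len(D5), len(centralizer), centralizer == powers_of_y)   # 10 5 True
```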


The proof of Lagrange’s Theorem will once again indicate how useful equivalence relations 
and equivalence classes are in higher mathematics. 




Proof of Lagrange’s Theorem. Let $\sim_r$ be the equivalence relation on the set $G$ defined in
Lemma 8.17(b). Since every equivalence class is a subset of $G$, and $G$ only has a finite number
of subsets, there is only a finite number of equivalence classes, and we will call them
$A_1, A_2, \ldots, A_m$. Recall that an equivalence class may have many names, but our list of
equivalence classes will include each class only once.


Proposition 4.13 tells us that every element of $G$ belongs to exactly one equivalence class, so
$$|G| = |A_1| + |A_2| + \cdots + |A_m|.$$


At this point, the question is how many elements belong to each class? 


For every $b \in G$, let $[b]$ denote the equivalence class containing $b$. If $a \in [b]$, then the
definition of $\sim_r$ tells us that there exists $h \in H$ such that $ab^{-1} = h$. This implies that $a = hb$,
so $a \in Hb$. As a result, $[b] \subseteq Hb$.


On the other hand, if $a \in Hb$, then there exists $h \in H$ such that $a = hb$. This immediately tells
us that $ab^{-1} = h \in H$, so $a \sim_r b$ and $a \in [b]$. Thus, $Hb \subseteq [b]$.


We can now see that for every $b \in G$, $[b] = Hb$. Lemma 8.17(a) told us that every set of the
form $Hb$ has the same number of elements as $H$. Therefore, every equivalence class has
the same number of elements as $H$. The equation $|G| = |A_1| + |A_2| + \cdots + |A_m|$ now becomes

$$|G| = |H| + |H| + \cdots + |H| = m|H|.$$

Since $|G| = m|H|$, $|H|$ does indeed divide $|G|$.
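The two key facts in the proof, that each equivalence class is a set of the form $Hb$ and that all classes have $|H|$ elements, can be observed in a small case. This is a sketch under our own assumptions: $S_3$ encoded as tuples, a particular subgroup of order 2, and helper names `mul`, `inv`, `equivalence_class`, and `right_coset` that do not come from the text.

```python
from itertools import permutations

def mul(p, q):
    """Product pq of two permutations stored as tuples (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    """Inverse permutation: if p sends i to p[i], the inverse sends p[i] to i."""
    q = [0] * len(p)
    for i, image in enumerate(p):
        q[image] = i
    return tuple(q)

G = list(permutations(range(3)))
H = {(0, 1, 2), (1, 0, 2)}   # a subgroup of order 2 (identity and one swap)

def equivalence_class(b):
    # All a with a b^{-1} in H, as in Lemma 8.17(b).
    return frozenset(a for a in G if mul(a, inv(b)) in H)

def right_coset(b):
    return frozenset(mul(h, b) for h in H)

classes = {equivalence_class(b) for b in G}
classes_are_cosets = all(equivalence_class(b) == right_coset(b) for b in G)
print(classes_are_cosets, len(classes), sorted(len(c) for c in classes))   # True 3 [2, 2, 2]
```

Here $m = 3$ classes of size $|H| = 2$ account for all $|G| = 6$ elements.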


The proof of Lagrange’s Theorem used $\sim_r$ and sets of the form $Hb$ from Lemma 8.17(b), but
we could have equally well used $\sim_l$ and sets of the form $bH$ from Lemma 8.17(c). Since
sets of the form $Hb$ and $bH$ are so important, we have


Definition 8.19. If $H$ is a subgroup of a group $G$ and $b \in G$, we call sets of the form $Hb$ right
cosets of $H$ and sets of the form $bH$ left cosets of $H$.


When G is finite with subgroup H and |G| = m|H|, we call m the index of H in G. Thus, m 
tells us both the number of right cosets and left cosets corresponding to H. Let us now take a 
look at some examples of right and left cosets. 


Examples 


I. Let $G = \mathbb{Z}_4$; observe that in this example we will be using additive and not
multiplicative notation.


1. If $H = \{[0]_4\}$, then $|H| = 1$ and $m = 4$. Therefore, $H$ has four right cosets and four
left cosets, each with only one element.
$$H + [0]_4 = \{[0]_4\} = [0]_4 + H,$$
$$H + [1]_4 = \{[1]_4\} = [1]_4 + H,$$
$$H + [2]_4 = \{[2]_4\} = [2]_4 + H,$$
$$H + [3]_4 = \{[3]_4\} = [3]_4 + H.$$


2. If $H = \{[0]_4, [2]_4\}$, then $|H| = 2$ and $m = 2$. Therefore, $H$ has two right cosets and
two left cosets, each with two elements. Observe that each right coset and each left
coset now have two different names, as they are equivalence classes with two
elements.

$$H + [0]_4 = H + [2]_4 = \{[0]_4, [2]_4\} = [2]_4 + H = [0]_4 + H,$$
$$H + [1]_4 = H + [3]_4 = \{[1]_4, [3]_4\} = [3]_4 + H = [1]_4 + H.$$


3. If $H$ is all of $\mathbb{Z}_4$, then $|H| = 4$ and $m = 1$. Therefore, $H$ has only one right coset and
only one left coset, and they consist of all four elements of $\mathbb{Z}_4$. Therefore, each right
coset and each left coset have four different names.

$$H + [0]_4 = H + [1]_4 = H + [2]_4 = H + [3]_4 = \{[0]_4, [1]_4, [2]_4, [3]_4\} =$$
$$[3]_4 + H = [2]_4 + H = [1]_4 + H = [0]_4 + H.$$


II. Let $G = S_3$; in this example things get much more interesting. Throughout this
example and this section, we will use the notation for the six elements in $S_3$ developed in
the previous section.


1. If $H = \{e\}$, then $|H| = 1$ and $m = 6$. Therefore, there are six right and six left cosets,
each with one element.

$$He = \{e\} = eH, \quad Hf = \{f\} = fH, \quad Hg = \{g\} = gH,$$
$$Hh = \{h\} = hH, \quad Hj = \{j\} = jH, \quad Hk = \{k\} = kH.$$


2. If $H = \{e, f\}$, then $|H| = 2$ and $m = 3$. Now there are three right and three left
cosets, each with two elements. But now when we look at the elements in the right
and left cosets, we notice that something new is happening.

$$He = Hf = \{e, f\},$$
$$Hg = Hj = \{g, j\},$$
$$Hh = Hk = \{h, k\},$$
whereas
$$eH = fH = \{e, f\},$$
$$gH = kH = \{g, k\},$$
$$hH = jH = \{h, j\}.$$

Note that $Hg \ne gH$, $Hj \ne jH$, $Hh \ne hH$, and $Hk \ne kH$. For the first time, we see
that for some $b \in G$, the right and left cosets $Hb$ and $bH$ need not be the same
subset of $G$.


3. If $H = \{e, g\}$, then $|H| = 2$ and $m = 3$. There are again three right and three left
cosets, each with two elements.

$$He = Hg = \{e, g\},$$
$$Hf = Hk = \{f, k\},$$
$$Hh = Hj = \{h, j\},$$
whereas
$$eH = gH = \{e, g\},$$
$$fH = jH = \{f, j\},$$
$$hH = kH = \{h, k\}.$$
Observe that $Hf \ne fH$, $Hk \ne kH$, $Hh \ne hH$, and $Hj \ne jH$.


4. If $H = \{e, h\}$, then $|H| = 2$ and $m = 3$. Again we have three right and three left
cosets, each with two elements.

$$He = Hh = \{e, h\},$$
$$Hf = Hj = \{f, j\},$$
$$Hg = Hk = \{g, k\},$$
whereas
$$eH = hH = \{e, h\},$$
$$gH = jH = \{g, j\},$$
$$fH = kH = \{f, k\}.$$

In this example, $Hf \ne fH$, $Hj \ne jH$, $Hg \ne gH$, and $Hk \ne kH$.




5. If $H = \{e, j, k\}$, then $|H| = 3$ and $m = 2$. Therefore, there will be two right cosets and
two left cosets, each with three elements.

$$He = Hj = Hk = \{e, j, k\},$$
$$Hf = Hg = Hh = \{f, g, h\},$$
and
$$eH = jH = kH = \{e, j, k\},$$
$$fH = gH = hH = \{f, g, h\}.$$
Note that we are back in the situation where, for every $b \in G$, $Hb = bH$.


6. If $H = S_3$, then $|H| = 6$ and $m = 1$. Thus, every right coset and every left coset is
equal to all of $S_3$.
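The comparison of right and left cosets carried out by hand above can be automated. In this sketch the encoding of $S_3$ as tuples is our own, so the elements do not carry the labels $e, f, g, h, j, k$ from the previous section; `H2` stands for one of the subgroups of order 2 and `H3` for the subgroup of order 3. The helper names are assumptions made for illustration.

```python
from itertools import permutations

def mul(p, q):
    """Product pq of two permutations stored as tuples (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))
H2 = {(0, 1, 2), (1, 0, 2)}              # identity and one swap
H3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # identity and the two 3-cycles

def right_coset(H, b):
    return frozenset(mul(h, b) for h in H)

def left_coset(H, b):
    return frozenset(mul(b, h) for h in H)

def cosets_agree(H):
    """True when Hb = bH for every b in G."""
    return all(right_coset(H, b) == left_coset(H, b) for b in G)

print(len({right_coset(H2, b) for b in G}), len({left_coset(H2, b) for b in G}))   # 3 3
print(cosets_agree(H2), cosets_agree(H3))                                           # False True
```

The number of right cosets and the number of left cosets agree in both cases, even though for `H2` the cosets themselves differ.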


We have seen that there exist subgroups $H$ of groups $G$ where $Hg \ne gH$, for some $g \in G$. This
raises the question of whether it is important for a subgroup $H$ to have the property that
$Hg = gH$, for all $g \in G$.


To answer this question, we will look back at the construction of $\mathbb{Z}_n$ from $\mathbb{Z}$. We began with $\mathbb{Z}$
and some $n \ge 2$ and then defined $\sim_n$ on $\mathbb{Z}$ as $a \sim_n b$ whenever $a - b$ was a multiple of $n$.
Rephrasing this in the language of groups, $\mathbb{Z}$ is a group under addition with subgroup $H = n\mathbb{Z}$.
Then, since we are dealing with groups under addition, $\sim_n$ is the same as the equivalence
relation in Lemma 8.17(b) in which $a \sim_r b$ whenever $a - b \in H$. Thus, the elements of $\mathbb{Z}_n$ are
the left and right cosets corresponding to the subgroup $n\mathbb{Z}$. In Chapter 7, we showed that
these cosets formed a group under addition. The key step was to first show that the addition
of cosets was, in this situation, well defined. More generally, given a subgroup of a group, we
can ask whether the left or right cosets form a group. In order to do this, if we are using
multiplicative notation, we first need to see whether or not the multiplication of cosets is well
defined. As we will soon see, the property that $Hg = gH$, for all $g \in G$, is exactly what is
needed to make this happen.


Definition 8.20. Let $G$ be a group with subgroup $H$. We say that $H$ is normal if $Hg = gH$, for
all $g \in G$.


Needless to say, if a mathematical property is called “normal,” then it is considered a very 
good property to have. If G is abelian, then it is easy to see that all subgroups are normal. This 
was certainly the case when we constructed $\mathbb{Z}_n$ using the group $\mathbb{Z}$ and subgroup $n\mathbb{Z}$. However,
for nonabelian groups, it can be the case that some subgroups are normal and others are not.


If we look back at our work with $S_3$, we can see that none of the three subgroups $\{e, f\}$, $\{e, g\}$,
and $\{e, h\}$ are normal. On the other hand, the three subgroups $\{e\}$, $\{e, j, k\}$, and $S_3$ are all
normal.


It is easy to see that for any group G, the subgroups G and {e} are always normal. It is also not 
hard to see that any subgroup of G that is contained in Z(G) must also be normal. Let us now 
begin to investigate the relationship between subgroups being normal and the multiplication of 
cosets. 


Example


Let $G = S_3$ and let $H = \{e, f\}$. If we wanted to multiply right cosets, the most natural
thing to do would be to define

$$Ha \cdot Hb = H(ab),$$

for all $a, b \in G$. Therefore, the way we multiply cosets is to multiply their names. The
problem is that cosets can have many different names, and we need to see if changing the
name of the cosets changes the answer when we multiply.


In this situation, we have $Hg = Hg$ and $Hg = Hj$; therefore, if the multiplication of right
cosets is well defined, then the products $Hg \cdot Hg$ and $Hg \cdot Hj$ would need to be the same.
However,

$$Hg \cdot Hg = H(gg) = He,$$
whereas
$$Hg \cdot Hj = H(gj) = Hh.$$

Since $He \ne Hh$, changing the names of the cosets did indeed change the answer when
multiplying. Hence, in this case, the multiplication of cosets is not well defined.
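The failure seen in this example can be detected mechanically: try every pair of names for every pair of right cosets and check whether the product coset ever changes. The sketch below uses our own tuple encoding of $S_3$ and hypothetical helper names; `H2` plays the role of a subgroup of order 2 and `H3` the subgroup of order 3.

```python
from itertools import permutations

def mul(p, q):
    """Product pq of two permutations stored as tuples (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))
H2 = {(0, 1, 2), (1, 0, 2)}              # a subgroup of order 2
H3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # the subgroup of order 3

def right_coset(H, b):
    return frozenset(mul(h, b) for h in H)

def coset_product_is_well_defined(H):
    """Check that H(ab) does not depend on which names a, b are chosen for Ha, Hb."""
    for a in G:
        for b in G:
            expected = right_coset(H, mul(a, b))
            for other_a in right_coset(H, a):      # every name of the coset Ha
                for other_b in right_coset(H, b):  # every name of the coset Hb
                    if right_coset(H, mul(other_a, other_b)) != expected:
                        return False
    return True

print(coset_product_is_well_defined(H2), coset_product_is_well_defined(H3))   # False True
```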


In the previous example, H was not a normal subgroup of G. We will now show that being 
normal is exactly what is needed for the multiplication of cosets to be well defined. 


Theorem 8.21. Let $G$ be a group with subgroup $H$ and define the multiplication of right
cosets as $Hg \cdot Hh = H(gh)$, for all $g, h \in G$. Then multiplication of right cosets is well defined
if and only if $H$ is normal.


Proof. In one direction, suppose that $H$ is normal. In order to show that multiplication of right
cosets is well defined, we need to show that if $a, b, c, d \in G$ such that $Ha = Hb$ and $Hc = Hd$,
then $Ha \cdot Hc = Hb \cdot Hd$. Since
$$Ha \cdot Hc = H(ac) \quad \text{and} \quad Hb \cdot Hd = H(bd),$$

we need to show that $H(ac) = H(bd)$.


Since $a, b$ belong to the same right coset, $ab^{-1} \in H$. Similarly, since $c, d$ belong to the same
right coset, $cd^{-1} \in H$. Therefore,

$$(1) \quad ab^{-1} = h \quad \text{and} \quad cd^{-1} = k,$$

where $h, k \in H$. Since the inverse of $bd$ is $d^{-1}b^{-1}$, in order to show that $H(ac) = H(bd)$, we
need to show that

$$(2) \quad (ac)(bd)^{-1} = acd^{-1}b^{-1} \in H.$$
In light of equation (1), $a = hb$ and $c = kd$, so equation (2) becomes
$$(3) \quad (ac)(bd)^{-1} = acd^{-1}b^{-1} = (hb)(kd)d^{-1}b^{-1} = hbkb^{-1}.$$

Clearly $bk \in bH$. However, since $H$ is normal, $bH = Hb$ and so, $bk \in Hb$. As a result, there
exists $l \in H$ such that $bk = lb$. Equation (3) now becomes

$$(ac)(bd)^{-1} = h(bk)b^{-1} = h(lb)b^{-1} = (hl)(bb^{-1}) = hl \in H.$$

Thus, $H(ac) = H(bd)$ and the multiplication of right cosets is well defined.


In the other direction, we will assume that the multiplication of right cosets is well defined,
and we will show that $H$ is normal. Therefore, if $a \in G$, we need to show that $Ha = aH$. It will
suffice to show that if $h \in H$, then $ha \in aH$ and $ah \in Ha$, for these combine to tell us that
$Ha \subseteq aH$ and $aH \subseteq Ha$.


Since $h \in H$, we know that $He = Hh$. Using the fact that coset multiplication is well defined,
we have

$$Ha \cdot Hh = Ha \cdot He.$$

Since $Ha \cdot Hh = H(ah)$ and $Ha \cdot He = H(ae) = Ha$, we know that $H(ah) = Ha$. As a result,
$(ah)a^{-1} \in H$ and so, there exists $k \in H$ such that $aha^{-1} = k$. This immediately implies
that $ah = ka \in Ha$.


Using an argument similar to the preceding one,

$$Ha^{-1} \cdot Hh = Ha^{-1} \cdot He.$$

However, $Ha^{-1} \cdot Hh = H(a^{-1}h)$ and $Ha^{-1} \cdot He = H(a^{-1}e) = Ha^{-1}$. This tells us that the right
cosets $H(a^{-1}h)$ and $Ha^{-1}$ are the same, so

$$(a^{-1}h)(a^{-1})^{-1} = a^{-1}ha \in H.$$

As a result, there exists $k \in H$ such that $a^{-1}ha = k$. This immediately implies that $ha = ak$, so
$ha \in aH$, thereby concluding the proof.


We can now generalize the construction of $\mathbb{Z}_n$ from $\mathbb{Z}$ and $n\mathbb{Z}$ that we saw in Chapter 7.


Corollary 8.22. Let $G$ be a group with normal subgroup $H$. The set of right cosets is a group
where multiplication is defined as $Ha \cdot Hb = H(ab)$, for all $a, b \in G$.


Proof. Since G is closed under multiplication, it is easy to see that the set of right cosets is 
closed under multiplication. Furthermore, Theorem 8.21 asserts that the multiplication of 
cosets is well defined. As a result, it remains to verify that the multiplication of cosets is 
associative, that it has an identity element, and that each coset has an inverse. We will soon see 
that the set of right cosets inherits the desired properties from G in a very straightforward way. 


Since $G$ is associative, if $a, b, c \in G$, then $(ab)c = a(bc)$, which implies that
$$(Ha \cdot Hb) \cdot Hc = H(ab) \cdot Hc = H((ab)c) = H(a(bc)) = Ha \cdot H(bc) = Ha \cdot (Hb \cdot Hc).$$

Therefore, the multiplication of cosets is associative.


When we looked at $\mathbb{Z}_n$, the equivalence class $[0]_n$ was the identity element. In our more
general situation, we would expect the coset containing the identity element of $G$ to be the
identity element when multiplying cosets. Indeed, if $a \in G$, since $ae = a = ea$, we have

$$Ha \cdot He = H(ae) = Ha = H(ea) = He \cdot Ha.$$

Thus, $He$, which is another name for $H$, is the identity element of coset multiplication.


Finally, we need to show that every coset has an inverse. Once again, we look at $\mathbb{Z}_n$ for some
direction. The inverse of the class containing $a \in \mathbb{Z}$ was the class containing $-a$. Moving to
the more general case and switching to multiplicative notation, we would then expect that the
inverse of the coset containing $a \in G$ be the coset containing $a^{-1} \in G$. Checking that this is
indeed the case, since $a \cdot a^{-1} = e = a^{-1} \cdot a$, we have
$$Ha \cdot Ha^{-1} = H(a \cdot a^{-1}) = He = H(a^{-1}a) = Ha^{-1} \cdot Ha.$$

Hence, $Ha^{-1}$ is the inverse of $Ha$, and we have succeeded in showing that the set of right
cosets is a group.


Groups of the type described in Corollary 8.22 are an important part of group theory, and we 
will return to them in greater detail in the next section. But now that we know what normal 
subgroups are, we can ask, how do we find normal subgroups? Certainly, if G is abelian, all 
subgroups are normal. Furthermore, for any group G, the subgroups {e} and G can easily be 
seen to be normal. However, to find other normal subgroups, we look at functions similar to 
isomorphisms. 


Definition 8.23. If $G_1, G_2$ are groups, then a function $\phi : G_1 \to G_2$ is called a
homomorphism of groups if $\phi(gh) = \phi(g)\phi(h)$, for all $g, h \in G_1$.


Observe that homomorphisms satisfy the same defining equation as isomorphisms. In fact, 
every isomorphism is a homomorphism. However, there are homomorphisms that are not 
isomorphisms, as a homomorphism can fail to be injective, surjective, or both. We can now 
look at the connection between homomorphisms and normal subgroups. 


Theorem 8.24. Let $\phi : G_1 \to G_2$ be a homomorphism of groups.
(a) If $e_1, e_2$ are, respectively, the identity elements of $G_1$ and $G_2$, then $\phi(e_1) = e_2$.
(b) If $g \in G_1$, then $\phi(g^{-1}) = \phi(g)^{-1}$.


(c) If we let $\mathrm{Ker}(\phi) = \{g \in G_1 \mid \phi(g) = e_2\}$ and $\mathrm{Im}(\phi) = \{\phi(g) \mid g \in G_1\}$, then $\mathrm{Ker}(\phi)$ is a
subgroup of $G_1$ and $\mathrm{Im}(\phi)$ is a subgroup of $G_2$.


(d) $\mathrm{Ker}(\phi)$ is a normal subgroup of $G_1$.


Proof. For part (a), we know that $e_1 = e_1 \cdot e_1$ and applying $\phi$ to this equation yields

$$\phi(e_1) = \phi(e_1 \cdot e_1) = \phi(e_1) \cdot \phi(e_1).$$

Multiplying the terms at the left and right of the previous equation by $\phi(e_1)^{-1}$ on the left
gives us

$$\phi(e_1)^{-1} \cdot \phi(e_1) = \phi(e_1)^{-1} \cdot \phi(e_1) \cdot \phi(e_1),$$
which, using associativity, yields
$$e_2 = e_2 \cdot \phi(e_1),$$
and finally
$$e_2 = \phi(e_1).$$


For part (b), if $g \in G_1$, then $e_1 = g \cdot g^{-1}$ and applying $\phi$ and using (a) gives us
$$e_2 = \phi(e_1) = \phi(g \cdot g^{-1}) = \phi(g) \cdot \phi(g^{-1}).$$

Multiplying the terms at the far left and right of the previous equation by $\phi(g)^{-1}$ on the left
tells us that

$$\phi(g)^{-1} \cdot e_2 = \phi(g)^{-1} \cdot \phi(g) \cdot \phi(g^{-1}),$$
which, using associativity, simplifies to
$$\phi(g)^{-1} = e_2 \cdot \phi(g^{-1}),$$
and then
$$\phi(g)^{-1} = \phi(g^{-1}).$$


For the first half of part (c), to show that $\mathrm{Ker}(\phi)$ is a subgroup of $G_1$, we again use
Proposition 8.6(a). Therefore, if $x, y \in \mathrm{Ker}(\phi)$, we need to show that $xy^{-1} \in \mathrm{Ker}(\phi)$. In light
of (a) and (b), we know that

$$\phi(y^{-1}) = \phi(y)^{-1} = e_2^{-1} = e_2.$$
Therefore,
$$\phi(x \cdot y^{-1}) = \phi(x) \cdot \phi(y^{-1}) = e_2 \cdot e_2 = e_2.$$
Hence, $xy^{-1} \in \mathrm{Ker}(\phi)$, as desired.


Next, to show that $\mathrm{Im}(\phi)$ is a subgroup, we once again use Proposition 8.6(a). Therefore, if
$x, y \in \mathrm{Im}(\phi)$, we must show that $xy^{-1} \in \mathrm{Im}(\phi)$. We know that there exist $a, b \in G_1$ such that
$x = \phi(a)$ and $y = \phi(b)$. Using (b), we now have

$$x \cdot y^{-1} = \phi(a) \cdot \phi(b)^{-1} = \phi(a) \cdot \phi(b^{-1}) = \phi(ab^{-1}).$$
Thus, $x \cdot y^{-1} \in \mathrm{Im}(\phi)$ as required.


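For part (d), we need to show that if $g \in G_1$, then $g\,\mathrm{Ker}(\phi) = \mathrm{Ker}(\phi)\,g$. We will do this by
showing that for every $h \in \mathrm{Ker}(\phi)$, we have $gh \in \mathrm{Ker}(\phi)\,g$ and $hg \in g\,\mathrm{Ker}(\phi)$. In order to do
this, we look at the elements $ghg^{-1}$, $g^{-1}hg$ and observe that

$$\phi(ghg^{-1}) = \phi(g) \cdot \phi(h) \cdot \phi(g^{-1}) = \phi(g) \cdot e_2 \cdot \phi(g^{-1}) = \phi(g) \cdot \phi(g^{-1}) = \phi(gg^{-1}) = \phi(e_1) = e_2$$
and
$$\phi(g^{-1}hg) = \phi(g^{-1}) \cdot \phi(h) \cdot \phi(g) = \phi(g^{-1}) \cdot e_2 \cdot \phi(g) = \phi(g^{-1}) \cdot \phi(g) = \phi(g^{-1}g) = \phi(e_1) = e_2.$$

The preceding equations tell us that both $ghg^{-1}$ and $g^{-1}hg$ belong to $\mathrm{Ker}(\phi)$. This enables us
to express $gh$, $hg$ as

$$gh = (ghg^{-1})g \in \mathrm{Ker}(\phi)\,g \quad \text{and} \quad hg = g(g^{-1}hg) \in g\,\mathrm{Ker}(\phi).$$

As a result, it is indeed the case that $gh \in \mathrm{Ker}(\phi)\,g$ and $hg \in g\,\mathrm{Ker}(\phi)$.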
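Part (d) can be illustrated with a homomorphism that is not among the examples of this section: the sign map from $S_3$ to the group $\{1, -1\}$ under multiplication. This choice, the tuple encoding, and the helper names `mul`, `inv`, and `sign` are all assumptions made for this sketch. It checks the homomorphism equation on every pair and then checks that $ghg^{-1}$ stays in the kernel, which is the computation at the heart of the proof.

```python
from itertools import permutations

def mul(p, q):
    """Product pq of two permutations stored as tuples (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    """Inverse permutation: if p sends i to p[i], the inverse sends p[i] to i."""
    q = [0] * len(p)
    for i, image in enumerate(p):
        q[image] = i
    return tuple(q)

def sign(p):
    """+1 for an even number of inversions, -1 for an odd number."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

G = list(permutations(range(3)))

is_homomorphism = all(sign(mul(g, h)) == sign(g) * sign(h) for g in G for h in G)
kernel = {g for g in G if sign(g) == 1}
conjugates_stay_in_kernel = all(
    mul(mul(g, h), inv(g)) in kernel for g in G for h in kernel
)
print(is_homomorphism, len(kernel), conjugates_stay_in_kernel)   # True 3 True
```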


Homomorphisms have uses far beyond being used to produce normal subgroups. For example,
suppose $\phi : G_1 \to G_2$ is a homomorphism of groups and suppose $a, b \in G_1$ commute when
multiplied. Since $a \cdot b = b \cdot a$, we have

$$\phi(a) \cdot \phi(b) = \phi(a \cdot b) = \phi(b \cdot a) = \phi(b) \cdot \phi(a).$$

Therefore, we have shown that since $a$ and $b$ commute in $G_1$, then $\phi(a)$ and $\phi(b)$ must
commute in $G_2$. You should convince yourself that this tells us that if $G_1$ is abelian and $\phi$ is
surjective, then $G_2$ must also be abelian. More generally, a homomorphism $\phi : G_1 \to G_2$
provides us with a link between algebraic properties of $G_1$ and those of $G_2$.


For another example, suppose $\phi : G_1 \to G_2$ is a homomorphism of groups that is also injective.
Also suppose that $G_2$ has the property that there is a fixed positive integer $m$ such that $h^m = h$,
for all $h \in G_2$. We claim that $G_1$ must also have this property. First observe that if $g \in G_1$, then
$\phi(g) \in G_2$, so $\phi(g)^m = \phi(g)$. However, since $\phi$ is a homomorphism, the identical argument
used in the proof of Lemma 5.9 shows us that $\phi(g^m) = \phi(g)^m$. As a result, $\phi(g^m) = \phi(g)$. But
since $\phi$ is injective, this tells us that $g^m = g$. Thus, $G_1$ does inherit this property from $G_2$.


Examples


Homomorphisms of Groups
1. If $n \ge 2$, then $\mathbb{Z}$ and $\mathbb{Z}_n$ are both groups under addition. Let
$$\phi : \mathbb{Z} \to \mathbb{Z}_n$$
be defined as
$$\phi(a) = [a]_n,$$
for all $a \in \mathbb{Z}$. Observe that
$$\phi(a + b) = [a + b]_n = [a]_n + [b]_n = \phi(a) + \phi(b),$$
for all $a, b \in \mathbb{Z}$. Thus, $\phi$ is a homomorphism. In this example, $\mathrm{Ker}(\phi) = n\mathbb{Z}$, which is
a normal subgroup of $\mathbb{Z}$, and $\mathrm{Im}(\phi) = \mathbb{Z}_n$, which is certainly a subgroup of $\mathbb{Z}_n$.


2. $\mathbb{R}^*$ is a group under multiplication. Let
$$\phi : \mathbb{R}^* \to \mathbb{R}^*$$
be defined as $\phi(x) = x^2$, for all $x \in \mathbb{R}^*$. Since $\mathbb{R}^*$ is abelian, we have

$$\phi(xy) = (xy)^2 = xyxy = x^2y^2 = \phi(x)\phi(y).$$

Thus, $\phi$ is a homomorphism. In this case, $\mathrm{Ker}(\phi) = \{1, -1\}$, which is a normal
subgroup of $\mathbb{R}^*$. Furthermore, $\mathrm{Im}(\phi)$ is the set of positive real numbers, which is also
a subgroup of $\mathbb{R}^*$.


3. Let $G$ be any abelian group and let $n \in \mathbb{Z}$. Define
$$\phi : G \to G$$
as $\phi(g) = g^n$, for all $g \in G$. Since $G$ is abelian, we have

$$\phi(gh) = (gh)^n = g^nh^n = \phi(g)\phi(h),$$

so $\phi$ is a homomorphism. Since $\mathrm{Ker}(\phi) = \{g \in G \mid g^n = e\}$ and $\mathrm{Im}(\phi) = \{g^n \mid g \in G\}$,
Theorem 8.24(c) tells us that the sets $\{g \in G \mid g^n = e\}$ and $\{g^n \mid g \in G\}$ are always
subgroups of an abelian group.


4. $\mathbb{Z}$ is a group under addition and $U(\mathbb{Q})$ is a group under multiplication. Let
$$\phi : \mathbb{Z} \to U(\mathbb{Q})$$
be defined as $\phi(n) = (-1)^n$, for all $n \in \mathbb{Z}$. Then $\phi$ is a homomorphism as
$$\phi(n + m) = (-1)^{n+m} = (-1)^n(-1)^m = \phi(n) \cdot \phi(m),$$
for all $n, m \in \mathbb{Z}$. Observe that $\phi(n) = 1$ if $n$ is even and $\phi(n) = -1$ if $n$ is odd. It is
easy to see that $\mathrm{Ker}(\phi) = 2\mathbb{Z}$ and $\mathrm{Im}(\phi) = \{1, -1\}$.


5. $\mathbb{Z}$ is a group under addition and $U(\mathbb{C})$ is a group under multiplication. Let
$$\phi : \mathbb{Z} \to U(\mathbb{C})$$
be defined as $\phi(n) = i^n$, for all $n \in \mathbb{Z}$. Then $\phi$ is a homomorphism as
$$\phi(n + m) = i^{n+m} = i^n \cdot i^m = \phi(n) \cdot \phi(m),$$
for all $n, m \in \mathbb{Z}$. Observe that $\phi(n) = 1$ if $n \equiv 0 \pmod 4$, $\phi(n) = i$ if $n \equiv 1 \pmod 4$,
$\phi(n) = -1$ if $n \equiv 2 \pmod 4$, and $\phi(n) = -i$ if $n \equiv 3 \pmod 4$. We now have
$\mathrm{Ker}(\phi) = 4\mathbb{Z}$ and $\mathrm{Im}(\phi) = \{1, -1, i, -i\}$.




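6. $\mathbb{R}^*$ is a group under multiplication. Let
$$\phi : \mathbb{R}^* \to \mathbb{R}^*$$
be defined as $\phi(x) = |x|$, for all $x \in \mathbb{R}^*$, where $|x|$ denotes the absolute value of $x$.
Then $\phi$ is a homomorphism as

$$\phi(x \cdot y) = |x \cdot y| = |x| \cdot |y| = \phi(x) \cdot \phi(y),$$

for all $x, y \in \mathbb{R}^*$. As in our second example, $\mathrm{Ker}(\phi) = \{1, -1\}$ and $\mathrm{Im}(\phi)$ is the set of
positive real numbers.


7. $\mathbb{R}^+$ is a group under multiplication and $\mathbb{R}$ is a group under addition. Let
$$\phi : \mathbb{R}^+ \to \mathbb{R}$$
be defined as $\phi(x) = \ln(x)$, for all $x \in \mathbb{R}^+$, where $\ln(x)$ denotes the logarithm of $x$
using the base $e$. Using a basic property of log functions, it follows that $\phi$ is a
homomorphism as

$$\phi(x \cdot y) = \ln(x \cdot y) = \ln(x) + \ln(y) = \phi(x) + \phi(y),$$
for all $x, y \in \mathbb{R}^+$. In this example, $\mathrm{Ker}(\phi) = \{1\}$ and $\mathrm{Im}(\phi) = \mathbb{R}$.

8. $\mathbb{Z}$ is a group under addition. Let
$$\phi : \mathbb{Z} \to \mathbb{Z}$$
be defined as $\phi(n) = 10n$, for all $n \in \mathbb{Z}$. Then $\phi$ is a homomorphism as
$$\phi(n + m) = 10(n + m) = 10n + 10m = \phi(n) + \phi(m),$$
for all $n, m \in \mathbb{Z}$. In this example, $\mathrm{Ker}(\phi) = \{0\}$ and $\mathrm{Im}(\phi) = 10\mathbb{Z}$.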
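Several of the maps out of $\mathbb{Z}$ listed above can be spot-checked on a finite window of integers. The sketch below is only a numerical check on sample values, not a proof; the function name `respects_operations`, the choice $n = 6$ for the first map, and the window $-12, \ldots, 12$ are assumptions made here.

```python
def respects_operations(phi, domain_op, codomain_op, samples):
    """Check phi(a * b) == phi(a) * phi(b) for all sampled a, b (a finite check only)."""
    return all(
        phi(domain_op(a, b)) == codomain_op(phi(a), phi(b))
        for a in samples
        for b in samples
    )

window = range(-12, 13)
n = 6

def add(a, b):
    return a + b

checks = {
    # a -> [a]_6, with classes represented by remainders 0..5
    "reduce mod 6": respects_operations(
        lambda a: a % n, add, lambda u, v: (u + v) % n, window
    ),
    # k -> (-1)^k, written without powers to stay in the integers
    "parity sign": respects_operations(
        lambda k: 1 if k % 2 == 0 else -1, add, lambda u, v: u * v, window
    ),
    # k -> 10k
    "times ten": respects_operations(lambda k: 10 * k, add, add, window),
}

kernel_mod_6 = [a for a in window if a % n == 0]
print(checks)
print(kernel_mod_6)   # [-12, -6, 0, 6, 12]
```

The sampled kernel of the first map consists of multiples of 6, in line with $\mathrm{Ker}(\phi) = n\mathbb{Z}$.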


We now turn our attention to the second question we posed earlier in this section. Namely, if $G$
is a finite group and $n \in \mathbb{N}$ divides $|G|$, must $G$ have a subgroup with $n$ elements? For the
examples we looked at earlier, such as $S_3$ and $D_4$, the answer was yes. In fact, although it is
not obvious at first glance, the answer turns out to be yes for all abelian groups. However,
in general, the answer turns out to be no. The smallest example that shows that the answer is
no is a group with 12 elements that does not contain a subgroup with 6 elements. We will take
a detailed look at this example in Section 8.4 when we examine symmetric groups.
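A brute-force search gives a preview of such an example. The sketch below assumes that the 12-element group in question is the alternating group $A_4$ (the even permutations of four symbols), which is the standard example; the text does not name the group at this point, so treat that identification, the tuple encoding, and the helper names as assumptions. The search counts subgroups of each order dividing 12 other than 12 itself.

```python
from itertools import combinations, permutations

def mul(p, q):
    """Product pq of two permutations stored as tuples (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return inversions % 2 == 0

A4 = [p for p in permutations(range(4)) if is_even(p)]   # assumed 12-element example
identity = (0, 1, 2, 3)
non_identity = [g for g in A4 if g != identity]

def is_closed(S):
    # A finite subset containing e and closed under products is a subgroup.
    return all(mul(a, b) in S for a in S for b in S)

def count_subgroups(order):
    return sum(
        1
        for rest in combinations(non_identity, order - 1)
        if is_closed({identity, *rest})
    )

subgroup_counts = {k: count_subgroups(k) for k in (1, 2, 3, 4, 6)}
print(len(A4), subgroup_counts)   # 12 {1: 1, 2: 3, 3: 4, 4: 1, 6: 0}
```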


What does turn out to be true is that if $G$ is a finite group and $p^\alpha$ divides $|G|$, where $p$ is a
prime, then $G$ must contain a subgroup with $p^\alpha$ elements. This result is known as Sylow’s
Theorem, and at the beginning of this chapter, we described it as the crown jewel of an
introductory course in group theory. It has countless applications for group theorists in their
attempts to understand the structure of finite groups. Even though this is a fundamental result
that provides deep insights into finite group theory, it does not require advanced techniques to
prove. In fact, we will be able to prove it after three relatively short lemmas.


As we attempt to find a subgroup of $G$ with $p^\alpha$ elements, we will need to derive some facts
about the number of subsets of $G$ with $p^\alpha$ elements. Recall that if a set $G$ has $n$ elements and
$1 \le t \le n$, then the number of subsets of $G$ with exactly $t$ elements is

$$\binom{n}{t} = \frac{n!}{(n-t)! \cdot t!}.$$

To motivate our next lemma, let us look at some examples,

$$24 = 2^1 \cdot 12 \quad \text{and} \quad \binom{24}{2^1} = 276,$$
$$24 = 2^2 \cdot 6 \quad \text{and} \quad \binom{24}{2^2} = 10626,$$
$$24 = 2^3 \cdot 3 \quad \text{and} \quad \binom{24}{2^3} = 735471.$$

Admittedly, it is hard to find a pattern in these examples. In an attempt to find a pattern,
we note that in all three cases, we can write 24 as $24 = 2^\alpha m$. Observe that in all three cases, the
largest power of 2 that divides $\binom{24}{2^\alpha}$ is the same as the largest power of 2 that divides $m$. For
example, when we write $24 = 2^2 \cdot 6$, then the largest power of 2 that divides 6 and the largest
power of 2 that divides 10626 is $2^1$. All of this is merely a special case of


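Lemma 8.25. Suppose $n = p^\alpha m$, where $p$ is a prime, $\alpha \ge 0$, and $m \in \mathbb{N}$. If $p^\beta$ is the largest
power of $p$ that divides $m$, then $p^\beta$ is also the largest power of $p$ that divides $\binom{n}{p^\alpha}$.


Proof. Since $\binom{n}{p^\alpha} = \frac{n!}{(n - p^\alpha)! \cdot (p^\alpha)!}$, we have

$$\binom{n}{p^\alpha} = \frac{(p^\alpha m)(p^\alpha m - 1)(p^\alpha m - 2) \cdots (p^\alpha m - (p^\alpha - 1))}{(p^\alpha)(p^\alpha - 1)(p^\alpha - 2) \cdots (1)}
= m \times \frac{(p^\alpha m - 1)(p^\alpha m - 2) \cdots (p^\alpha m - (p^\alpha - 1))}{(p^\alpha - 1)(p^\alpha - 2) \cdots (1)}.$$

Therefore, it suffices to show that when we look at the fraction

$$\frac{(p^\alpha m - 1)(p^\alpha m - 2) \cdots (p^\alpha m - (p^\alpha - 1))}{(p^\alpha - 1)(p^\alpha - 2) \cdots (1)},$$

the power of $p$ that occurs in the prime factorizations of both the numerator and the
denominator is the same.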


At first, this looks like it might involve some difficult computations. However, things will not
be too difficult because we will be able to do this one term at a time. More precisely, for any
$1 \le j \le p^\alpha - 1$, we will show $p^t$ divides $p^\alpha - j$ if and only if it divides $p^\alpha m - j$. This will
suffice as the power of $p$ that divides the numerator is the product of all the powers of $p$ that
divide terms of the form $p^\alpha m - j$, and the power of $p$ that divides the denominator is the
product of all the powers of $p$ that divide terms of the form $p^\alpha - j$.


In one direction, suppose $p^t$ divides $p^\alpha - j$. Since $p^t \le p^\alpha - j < p^\alpha$, it is clear that $t < \alpha$.
Therefore, $p^t$ divides $p^\alpha - j$, $p^\alpha$, and $p^\alpha m$, so $p^t$ divides $j = p^\alpha - (p^\alpha - j)$. But this
immediately implies that $p^t$ also divides $p^\alpha m - j$.


In the other direction, suppose $p^t$ divides $p^\alpha m - j$. Observe that if $t \ge \alpha$, then $p^\alpha$ divides both
$p^\alpha m - j$ and $p^\alpha m$, which would imply that $p^\alpha$ divides $j = p^\alpha m - (p^\alpha m - j)$. But this contradicts
the fact that $1 \le j \le p^\alpha - 1$. Thus, $t < \alpha$, which implies that $p^t$ divides $p^\alpha$, $p^\alpha m$, and $p^\alpha m - j$.
As a result, $p^t$ divides $j = p^\alpha m - (p^\alpha m - j)$. Since $p^t$ divides both $p^\alpha$ and $j$, $p^t$ divides $p^\alpha - j$,
as desired.


In light of the preceding argument, a power of $p$ divides $\binom{n}{p^\alpha}$ if and only if it divides $m$.
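The lemma is easy to test numerically. The following sketch uses `math.comb` from the Python standard library (available from Python 3.8); the helper names `valuation` and `lemma_holds` and the range of test values are choices made here. It reproduces the three binomial coefficients from the motivating examples and then compares the exponent of $p$ in $\binom{n}{p^\alpha}$ with the exponent of $p$ in $m$ for a range of small cases.

```python
from math import comb

def valuation(p, k):
    """Exponent of the largest power of the prime p that divides the positive integer k."""
    exponent = 0
    while k % p == 0:
        k //= p
        exponent += 1
    return exponent

def lemma_holds(p, alpha, m):
    """Compare the power of p in C(p^alpha * m, p^alpha) with the power of p in m."""
    n = p**alpha * m
    return valuation(p, comb(n, p**alpha)) == valuation(p, m)

examples = [comb(24, 2**a) for a in (1, 2, 3)]
print(examples)   # [276, 10626, 735471]

all_cases_agree = all(
    lemma_holds(p, alpha, m)
    for p in (2, 3, 5)
    for alpha in range(0, 4)
    for m in range(1, 13)
)
print(all_cases_agree)   # True
```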


The second lemma we need to prove Sylow’s Theorem is 


Lemma 8.26. Let $G$ be a group with subset $M$.
(a) If $a, b \in G$, then $(Ma)b = M(ab)$.


(b) If $n \in \mathbb{N}$ and if we define $\sim$ on the set of subsets of $G$ with $n$ elements as $M \sim N$
whenever there exists $a \in G$ such that $M = Na$, then $\sim$ is an equivalence relation.


Proof. For part (a), we know that $M(ab) = \{m(ab) \mid m \in M\}$. On the other hand,
$Ma = \{ma \mid m \in M\}$ and $(Ma)b = \{nb \mid n \in Ma\}$, which combine to imply that
$(Ma)b = \{(ma)b \mid m \in M\}$. The associative law tells us that $m(ab) = (ma)b$, so the sets $M(ab)$
and $(Ma)b$ consist of the same elements of $G$. Thus, $(Ma)b = M(ab)$.


For part (b), we need to show that $\sim$ is reflexive, symmetric, and transitive. Observe that if
$N \subseteq G$, with $|N| = n$, then $N = Ne$. Therefore, $N \sim N$ and $\sim$ is reflexive. Next, to show that
$\sim$ is symmetric, suppose we are given $M, N \subseteq G$ such that $|M| = |N| = n$ and $M \sim N$; we
need to show that $N \sim M$. By the definition of $\sim$, $M = Na$, for some $a \in G$. In light of
part (a), when we multiply this equality of sets on the right by $a^{-1}$, we obtain

$$Ma^{-1} = (Na)a^{-1} = N(aa^{-1}) = Ne = N.$$

Therefore, $N = Ma^{-1}$, which tells us that $N \sim M$, and so $\sim$ is symmetric.


Finally, to show that $\sim$ is transitive, suppose we are given $M, N, P \subseteq G$ such that
$|M| = |N| = |P| = n$, $M \sim N$, and $N \sim P$; we need to show that $M \sim P$. We know there exist
$a, b \in G$ such that $M = Na$ and $N = Pb$, and we need to find some $c \in G$ such that $M = Pc$. If
we multiply the equality $N = Pb$ on the right by $a$ and apply (a), we obtain

$$Na = (Pb)a = P(ba).$$

Combining this with the fact that $M = Na$, if we let $c = ba$, we can see that $M = P(ba) = Pc$.
Hence, $M \sim P$. Thus, $\sim$ is also transitive and is therefore an equivalence relation.


The final piece of the puzzle we need before proving Sylow’s Theorem is 


Lemma 8.27. Let $M$ be a subset of a group $G$ and let $H = \{g \in G \mid Mg = M\}$.

(a) $H$ is a subgroup of $G$.

(b) If $a, b \in G$, then $Ha = Hb$ if and only if $Ma = Mb$.

(c) If $M$ is finite, then $|M| \ge |H|$.

Proof. For part (a), Proposition 8.6(a) asserts that to show $H$ is a subgroup, it suffices to show
that if $x, y \in H$, then $xy^{-1} \in H$. This means that we must verify that $M(xy^{-1}) = M$. Using
Lemma 8.26(a), we have

$$M = Me = M(yy^{-1}) = (My)y^{-1} = My^{-1}.$$
The preceding equation, along with an additional application of Lemma 8.26(a), tells us that
$$M(xy^{-1}) = (Mx)y^{-1} = My^{-1} = M,$$
so $xy^{-1} \in H$.


For part (b), suppose $a, b \in G$ such that $Ha = Hb$. As we saw in the proof of Lagrange’s
Theorem, $a$ and $b$ belong to the same equivalence class under the equivalence relation $\sim_r$ that
was defined in Lemma 8.17(b). Therefore, $ab^{-1} = h$, for some $h \in H$. This implies that
$a = hb$, which, using Lemma 8.26(a), implies that

$$Ma = M(hb) = (Mh)b = Mb.$$

In the other direction, suppose $a, b \in G$ such that $Ma = Mb$. Multiplying this on the right
by $b^{-1}$ and using Lemma 8.26(a), we obtain

$$M(ab^{-1}) = (Ma)b^{-1} = (Mb)b^{-1} = M(bb^{-1}) = Me = M.$$

But this means that $ab^{-1} \in H$, which implies that $a \sim_r b$ under the equivalence relation just
mentioned. Since $Ha$ is the equivalence class of $a$ and $Hb$ is the equivalence class of $b$, we see
that $Ha = Hb$, as desired.


For part (c), if $m \in M$, consider the set $mH$. In light of Lemma 8.17(a), $|mH| = |H|$. On the
other hand, by the definition of $H$, every element of $mH$ belongs to $M$. Since $mH$ is a subset
of $M$, we have $|M| \ge |mH| = |H|$, as desired.


We can now prove the main result of this section. 


Theorem 8.28 (Sylow’s Theorem). Let $G$ be a finite group and suppose $p^\alpha$ divides $|G|$,
where $p$ is a prime. Then $G$ contains a subgroup with $p^\alpha$ elements.


Proof: We know that $G$ has many subsets with exactly $p^\alpha$ elements; in fact, if we let $n = |G|$,
then we know that $G$ has $\binom{n}{p^\alpha}$ of them. The question is, how do we produce a subgroup with $p^\alpha$
elements? Once again, equivalence relations and equivalence classes play a key role.


Since $p^\alpha$ divides $n$, we can write $n = p^\alpha m$, for some $m \in \mathbb{N}$. Furthermore, we can let $p^\beta$
denote the largest power of $p$ that divides $m$, keeping in mind that it is possible that $p^\beta = 1$.
As shown in Lemma 8.25, $p^\beta$ is also the largest power of $p$ that divides $\binom{n}{p^\alpha}$.


Let T denote all subsets of G with $p^\alpha$ elements and, in light of Lemma 8.26(b), we can
let ~ be the equivalence relation on T defined as M ~ N, whenever there exists $a \in G$ such
that M = Na. Since T is finite, there are only a finite number of equivalence classes, and we
can let $A_1, A_2, \ldots, A_d$ denote these classes. Remember that each class may have many different
names, but each element of T belongs to only one equivalence class. Therefore,


$|T| = |A_1| + |A_2| + \cdots + |A_d|.$


Since $|T| = \binom{n}{p^\alpha}$, we know that $p^\beta$ divides $|T|$, but $p^{\beta+1}$ does not divide $|T|$. In light of
the previous equation, we can see that there must be at least one equivalence class where $p^{\beta+1}$
does not divide the number of elements in it. Therefore, there exists some $M \in T$ such that
$p^{\beta+1}$ does not divide the number of elements in the equivalence class containing M.


What does [M], the equivalence class containing M, look like? By the definition of ~,
$N \in [M]$ whenever N = Ma, for some $a \in G$. Therefore,


$[M] = \{Ma_1, Ma_2, \ldots, Ma_l\},$


for various $a_i \in G$.


We can now ask, how many different sets belong to [M]? Remember that each element of 
[M] can have many different names, but we are really asking the question, how many different 
sets of the form Mg are there as g ranges through the elements of G? 


If we let $H = \{g \in G \mid Mg = M\}$, then Lemma 8.27(a) told us that H is a subgroup of G.
Furthermore, Lemma 8.27(b) asserted that if $a, b \in G$, then Ma = Mb if and only if Ha = Hb.
This means that the number of sets of the form Mg, as g ranges through G, is the same as the




number of sets of the form Hg, as g ranges through G. However, we already called this
number the index of H in G and saw, since G is finite, that it is equal to $|G|/|H|$.


We can now determine the size of the subgroup H. Our previous work indicated that

$|G| = l \cdot |H|,$

where l is not only the index of H in G but is also the number of elements in [M]. Since
$|G| = n = p^\alpha m$ and $p^\beta$ divides m, we know that $p^{\alpha+\beta}$ divides $|G|$. On the other hand, we
have already seen that $p^{\beta+1}$ does not divide l. Therefore, the only way that $p^{\alpha+\beta}$ can divide
$l \cdot |H|$ is if $p^\alpha$ divides $|H|$.


Since $p^\alpha$ divides $|H|$ and H has at least one element, we now know that

$|H| \geq p^\alpha.$

On the other hand, Lemma 8.27(c) indicated that $|H| \leq |M|$, and we know $|M| = p^\alpha$, so

$|H| \leq p^\alpha.$


Combining the preceding observations, we can see that H is a subgroup of G that contains
exactly $p^\alpha$ elements, as desired.
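

The construction in this proof can be tried out by brute force on a very small group. The following is only a sketch, under the assumption that the group is given as a list of permutations written as tuples; the names compose, translate, and subgroup_from_proof are ours and do not appear in the text. It searches T for a set M whose class [M] has size not divisible by $p^{\beta+1}$ and then returns $H = \{g \in G \mid Mg = M\}$.

```python
from itertools import combinations, permutations


def compose(s, t):
    """Product st of two permutations of {0, ..., k-1}: apply s, then t."""
    return tuple(t[s[i]] for i in range(len(s)))


def translate(M, g):
    """The set Mg = {xg : x in M}."""
    return frozenset(compose(x, g) for x in M)


def subgroup_from_proof(G, p, alpha):
    """Search T for a set M whose class [M] has size not divisible by
    p^(beta+1), then return H = {g in G : Mg = M}."""
    n = len(G)
    assert n % p**alpha == 0, "p^alpha must divide |G|"
    m, beta = n // p**alpha, 0
    while m % p == 0:  # p^beta is the largest power of p dividing m
        m //= p
        beta += 1
    for subset in combinations(G, p**alpha):
        M = frozenset(subset)
        equivalence_class = {translate(M, g) for g in G}
        if len(equivalence_class) % p**(beta + 1) != 0:
            return {g for g in G if translate(M, g) == M}
    return None  # not reached when p^alpha divides |G|, by the theorem


S3 = list(permutations(range(3)))
H2 = subgroup_from_proof(S3, 2, 1)  # a subgroup of S_3 with 2 elements
H3 = subgroup_from_proof(S3, 3, 1)  # a subgroup of S_3 with 3 elements
print(sorted(H2), sorted(H3))
```

The search runs through all $\binom{n}{p^\alpha}$ subsets, so it is only practical for very small groups; it is meant to illustrate the argument, not to be an efficient algorithm.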


We will conclude this section by briefly examining some of the things that Lagrange’s Theorem 
and Sylow’s Theorem tell us about finite groups. But first, we need some terminology. 


If $G_1, G_2$ are groups, we define the direct product

$G_1 \times G_2 = \{(a, b) \mid a \in G_1, b \in G_2\}.$


It is not hard to check that $G_1 \times G_2$ is also a group. Observe that if $e_1$ and $e_2$ are, respectively,
the identity elements of $G_1$ and $G_2$, then $(e_1, e_2)$ is the identity element of $G_1 \times G_2$.
Furthermore, if $a \in G_1$ and $b \in G_2$, then $(a^{-1}, b^{-1})$ is the inverse in $G_1 \times G_2$ of (a, b). More
generally, if $G_1, G_2, \ldots, G_n$ are groups we can form the direct product


$G_1 \times G_2 \times \cdots \times G_n = \{(a_1, a_2, \ldots, a_n) \mid a_i \in G_i, \text{ for } 1 \leq i \leq n\}.$


Next, suppose G is cyclic and $|G| = n$, where $n \geq 1$. If $g \in G$ such that < g > = G, then

$G = \{e, g, g^2, \ldots, g^{n-1}\}.$


Observe that this tells us that any two finite cyclic groups with the same number of elements 
are isomorphic. Indeed, if H is another cyclic group with n elements and H = <h >, then the 
function 


$\phi : G \to H$




defined as $\phi(g^i) = h^i$, for $0 \leq i \leq n-1$, is an isomorphism. We can now use Lagrange's
Theorem to prove


Corollary 8.29. Every group with p elements, where p is a prime, is cyclic. Therefore, up to
isomorphism, there is only one group of order p.


Proof. If G is a group with p elements, where p is prime, then G contains at least one element
that is not the identity element e. Therefore, if $g \in G$ such that $g \neq e$, then $|\langle g \rangle| \geq 2$.
However, by Lagrange's Theorem, $|\langle g \rangle|$ must divide the prime number p. Hence,
$|\langle g \rangle| = p$, so < g > is all of G, and G is cyclic.


Since all finite cyclic groups of the same order are isomorphic and all groups of order p are 


cyclic, it follows that, up to isomorphism, there is only one group of order p. 
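

As a small numerical illustration of this proof, the sketch below looks at the additive group $\mathbb{Z}_n$ and computes the size of the cyclic subgroup generated by each element. The function name subgroup_size is ours. When n is a prime p, every nonidentity element should generate a subgroup with p elements, exactly as Lagrange's Theorem forces.

```python
def subgroup_size(g, n):
    """Number of elements in the cyclic subgroup <g> of Z_n under addition."""
    count, x = 1, g % n
    while x != 0:  # keep adding g until we return to the identity 0
        x = (x + g) % n
        count += 1
    return count


for p in (2, 3, 5, 7, 11):
    sizes = {subgroup_size(g, p) for g in range(1, p)}
    print(p, sizes)  # every nonidentity element generates all of Z_p
```

For a composite n the same function returns proper divisors of n for some elements, which is why the argument is special to prime order.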


Earlier in this section, we remarked that there exists a group with 12 elements that does not 
contain a subgroup with 6 elements. However, we can now use Sylow’s Theorem to see that 
any group with fewer than 12 elements has the property that if n divides |G|, then G has a 
subgroup with n elements. 


Corollary 8.30. Let G be a group such that $|G| \leq 11$. If $n \in \mathbb{N}$ such that n divides $|G|$, then G
contains a subgroup of order n.


Proof. If $|G| = p^m$ and n divides $|G|$, where p is prime, then $n = p^\alpha$, where $0 \leq \alpha \leq m$. This
certainly occurs when $|G|$ = 1, 2, 3, 4, 5, 7, 8, 9, 11, and, in these cases, Sylow's Theorem
asserts that G has a subgroup of order n. We should point out that in these cases we really only
need to use Sylow's Theorem when $|G|$ = 4, 8, 9, as the cases where $|G|$ = 1, 2, 3, 5, 7, 11
follow immediately from the fact that {e} and G are always subgroups of G.


It remains to consider the cases where $|G|$ = 6, 10. In both of these cases, $|G| = pq$, where p
and q are different primes. Since {e} and G are subgroups, G certainly has subgroups of order
1 and pq. However, we can now use Sylow's Theorem to assert that G also has subgroups of
order p and q, thereby covering the remaining $n \in \mathbb{N}$ that divide $|G|$.


We should point out that since 13, 14, 15, 16, 17 are also of the form $p^m$ or pq, where p
is a prime and q is a different prime, Corollary 8.30 holds for groups of these orders.
Corollary 8.30 indicates that groups of order at most 11 are, in some ways, not as complicated
as some other groups. We will now present a table of all groups, up to isomorphism, of order
$n \leq 11$. Since being isomorphic is an equivalence relation, the table will provide an example
of one group per equivalence class. Before presenting this table, we need one more example.


Suppose we let G = {1, -1, i, -i} be the four fourth roots of 1 in $\mathbb{C}$. We will use G to construct a
larger group. Let j have the property that $j^2 = -1$. Thus, j behaves like i. However, when we




multiply i and j, the order matters as ji = -ij. In this situation, the set


Qu = {1, -1, i, -i, j, -j, ij, -ij}

is a nonabelian group with eight elements. We use the notation Qu as these eight elements are
a subset of a very important noncommutative ring known as the Quaternions.
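

The defining relations of Qu can be checked mechanically. The sketch below is an illustration under an assumption of ours: each element is stored as a 4-tuple (a, b, c, d) standing for a + b·i + c·j + d·ij, and products are computed with the usual Hamilton product of quaternions. The names qmul and neg are ours.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + b*i + c*j + d*ij."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def neg(q):
    """The element -q."""
    return tuple(-t for t in q)


ONE, I, J = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)
IJ = qmul(I, J)
Qu = {ONE, neg(ONE), I, neg(I), J, neg(J), IJ, neg(IJ)}

closed = all(qmul(a, b) in Qu for a in Qu for b in Qu)
print(len(Qu), closed)            # eight elements, closed under multiplication
print(qmul(J, J) == neg(ONE))     # j^2 = -1
print(qmul(J, I) == neg(IJ))      # ji = -ij, so Qu is not abelian
```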


The proofs that our table contains all groups, up to isomorphism, of order at most 11 will 
appear in several exercises at the end of the next section. They follow from corollaries in this 
and the next section. In this table, for $n \geq 1$, we will let $C_n$ denote a cyclic group of order n.


|G| = 1     $C_1$
|G| = 2     $C_2$
|G| = 3     $C_3$
|G| = 4     $C_4$ and $C_2 \times C_2$ (both abelian)
|G| = 5     $C_5$
|G| = 6     $C_6$ and $S_3$ (one abelian, one nonabelian)
|G| = 7     $C_7$
|G| = 8     $C_8$, $C_4 \times C_2$, $C_2 \times C_2 \times C_2$, $D_4$, and Qu (three abelian, two nonabelian)
|G| = 9     $C_9$ and $C_3 \times C_3$ (both abelian)
|G| = 10    $C_{10}$ and $D_5$ (one abelian, one nonabelian)
|G| = 11    $C_{11}$


Exercises for Section 8.2 


Exercises 1-13, as well as several others, will refer to the following multiplication table for 
the group Qu. 


  ·   |   1    -1     i    -i     j    -j    ij   -ij
------+------------------------------------------------
  1   |   1    -1     i    -i     j    -j    ij   -ij
 -1   |  -1     1    -i     i    -j     j   -ij    ij
  i   |   i    -i    -1     1    ij   -ij    -j     j
 -i   |  -i     i     1    -1   -ij    ij     j    -j
  j   |   j    -j   -ij    ij    -1     1     i    -i
 -j   |  -j     j    ij   -ij     1    -1    -i     i
 ij   |  ij   -ij     j    -j    -i     i    -1     1
-ij   | -ij    ij    -j     j     i    -i     1    -1


1. In Qu, write down the elements of the subgroups < 1 > and C(1). 
2. In Qu, write down the elements of the subgroups < —1 > and C(—1). 


3. In Qu, write down the elements of the subgroups < i > and C(i). 


4. In Qu, write down the elements of the subgroups < -i > and C(-i).

5. In Qu, write down the elements of the subgroups < j > and C(j).

6. In Qu, write down the elements of the subgroups < -j > and C(-j).

7. In Qu, write down the elements of the subgroups < ij > and C(ij).

8. In Qu, write down the elements of the subgroups < -ij > and C(-ij).

9. In Qu, write down the elements in all the right cosets and all the left cosets of < -1 >.

10. In Qu, write down the elements in all the right cosets and all the left cosets of < i >.

11. In Qu, write down the elements in all the right cosets and all the left cosets of < j >.

12. In Qu, write down the elements in all the right cosets and all the left cosets of < ij >.

13. In Qu, determine the number of elements of order 1, 2, 4, and 8.

14. In $D_4$, determine the number of elements of order 1, 2, 4, and 8. You might want to look
at exercises 37-43 from the previous section.

15. Based on the previous two exercises and exercise 68 from the previous section, show that
Qu and $D_4$ are not isomorphic.

16. In the group $C_8$, determine the number of elements of order 1, 2, 4, and 8.

17. In the group $C_4 \times C_2$, determine the number of elements of order 1, 2, 4, and 8.

18. In the group $C_2 \times C_2 \times C_2$, determine the number of elements of order 1, 2, 4, and 8.

19. Use exercises 15-18 to show that there are at least five different groups, up to
isomorphism, with eight elements.

20. Find all the elements in Z(Qu).

21. If $n \geq 3$, show that $y^i x$ has order 2 in $D_n$.

22. If $n \geq 3$ is odd, determine the number of elements of order 2 in $D_n$.

23. If $n \geq 3$ is even, determine the number of elements of order 2 in $D_n$.

24. In $D_4$, write down the elements in all the right cosets and left cosets of the subgroup
{e, x}. Is {e, x} normal?

25. In $D_4$, write down the elements in all the right cosets and left cosets of the subgroup
{e, yx}. Is {e, yx} normal?

26. In $D_4$, write down the elements in all the right cosets and left cosets of the subgroup
$\{e, y^2x\}$. Is $\{e, y^2x\}$ normal?




27. In $D_4$, write down the elements in all the right cosets and left cosets of the subgroup
$\{e, y^3x\}$. Is $\{e, y^3x\}$ normal?

28. In $D_4$, write down the elements in all the right cosets and left cosets of the subgroup
$\{e, y^2\}$. Is $\{e, y^2\}$ normal?

29. In $D_4$, write down the elements in all the right cosets and left cosets of the subgroup
$\{e, y, y^2, y^3\}$. Is $\{e, y, y^2, y^3\}$ normal?

30. If $n \geq 3$, show that $Z(D_n) = C(x) \cap C(y)$.

31. If $n \geq 3$ is odd, use exercise 30 to show that $Z(D_n) = \{e\}$.

32. If $n \geq 3$ is even, use exercise 30 to show that $Z(D_n) = \{e, y^{n/2}\}$.

33. Let $\tau \in S_n$, where $n \geq 3$.

(a) If $\tau$ is not the identity map, show that there exist $i, j \in \{1, 2, \ldots, n\}$ such that $i \neq j$
and $\tau(i) = j$.

(b) Suppose $k \in \{1, 2, \ldots, n\}$ such that $k \neq i$ and $k \neq j$. If $\sigma \in S_n$ such that $\sigma(i) = i$ and
$\sigma(j) = k$, show that $(\tau \circ \sigma)(i) \neq (\sigma \circ \tau)(i)$.

(c) Use parts (a) and (b) to show that if $n \geq 3$, then $Z(S_n)$ consists only of the
identity map.

34. For which values of n does $U(\mathbb{Z}_{10})$ have a subgroup of order n?

35. For which values of n does $U(\mathbb{Z}_{12})$ have a subgroup of order n?

36. For which values of n does $U(\mathbb{Z}_{14})$ have a subgroup of order n?

37. For which values of n does $U(\mathbb{Z}_{16})$ have a subgroup of order n?

38. For which values of n does $U(\mathbb{Z}_{18})$ have a subgroup of order n?

39. For which values of n does $U(\mathbb{Z}_{20})$ have a subgroup of order n?

40. If G is a group with $p^m$ elements, where p is prime and $m \in \mathbb{N}$, for which values of n
does G have a subgroup of order n?

41. If G is a group with n elements and $g \in G$, show that $g^n = e$, where e is the identity
element.

42. Suppose H, K are subgroups of a group G where $|H| = 2$ and $|K| = 3$. Show that
$|H \cap K| = 1$.

43. Suppose H, K are subgroups of a group G where $|H| = p$ and $|K| = q$, where p and q are
different primes. Show that $|H \cap K| = 1$.


44. Suppose H, K are finite subgroups of a group G such that the orders of H and K are
relatively prime. Show that $H \cap K$ contains only one element.

45. If G is a nonabelian finite group and $g \in G$, show that $|C(g)| > |Z(G)|$. You might want
to consider separately the cases where $g \in Z(G)$ and $g \notin Z(G)$.

46. If G is a group of order pm, where p is a prime and $m \in \mathbb{N}$, show that Z(G) cannot have
exactly m elements. You might want to use exercise 45 and then separately consider the
cases where G is abelian and where G is nonabelian.

47. Suppose G is a group of order pq, where p and q are primes which are not necessarily
different. Show that if Z(G) has more than one element, then G is abelian.

48. If G is a group where every element has order 1 or 2, show that G must be abelian.

49. Use exercise 48 to show that every group of order 4 must be abelian.

50. Let G be a group and let $\phi : G \to G$ be defined as $\phi(g) = g^{-1}$, for all $g \in G$. Show that if
$\phi$ is a homomorphism, then G is abelian.

51. Let G be a group and let $\rho : G \to G$ be defined as $\rho(g) = g^2$, for all $g \in G$.

(a) Show that if G is abelian, then $\rho$ is a homomorphism.

(b) Show that if $\rho$ is a homomorphism, then G is abelian.

52. Let $G = H_1 \times H_2$, where $H_1$ and $H_2$ are groups. Show that G is abelian if and only if
both $H_1$ and $H_2$ are abelian.

53. Let $G = H_1 \times H_2$, where $H_1$ and $H_2$ are groups. Show that if G is cyclic then both $H_1$
and $H_2$ are cyclic.

54. Suppose $H_1$ and $H_2$ are finite cyclic groups. Show that $H_1 \times H_2$ is cyclic if and only if
$|H_1|$ and $|H_2|$ are relatively prime.


In exercises 55-62, $\phi$ is a homomorphism of groups. Describe, as simply as possible, the
elements of Ker($\phi$) and Im($\phi$).


55. $\phi : \mathbb{C}^* \to \mathbb{C}^*$, where $\phi(\alpha) = |\alpha|$, for all $\alpha \in \mathbb{C}^*$. Note that $\mathbb{C}^*$ is a group under
multiplication.


56. $\phi : \mathbb{C}^* \to \mathbb{C}^*$, where $\phi(\alpha) = \alpha^4$, for all $\alpha \in \mathbb{C}^*$. Note that $\mathbb{C}^*$ is a group under
multiplication.


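57. $\phi : \mathbb{R}^* \to \mathbb{R}^*$, where $\phi(\alpha) = \alpha^4$, for all $\alpha \in \mathbb{R}^*$. Note that $\mathbb{R}^*$ is a group under
multiplication.


58. $\phi : \mathbb{R}[x] \to \mathbb{R}$, where $\phi(f(x)) = f(0)$, for all $f(x) \in \mathbb{R}[x]$. Note that $\mathbb{R}[x]$ and $\mathbb{R}$ are
groups under addition.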




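59. $\phi : \mathbb{Z}[x] \to \mathbb{Q}$, where $\phi(g(x)) = g(0)$, for all $g(x) \in \mathbb{Z}[x]$. Note that $\mathbb{Z}[x]$ and $\mathbb{Q}$ are
groups under addition.


60. $\phi : D_3 \to \mathbb{R}^*$, where $\phi(y^i x^j) = (-1)^j$, for $0 \leq i \leq 2$ and $0 \leq j \leq 1$. Note that $\mathbb{R}^*$ is a
group under multiplication.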


61. $\phi : \mathbb{Z}_{20} \to \mathbb{Z}_4$, where $\phi([i]_{20}) = [i]_4$, for all $i \in \mathbb{Z}$. Note that $\mathbb{Z}_{20}$ and $\mathbb{Z}_4$ are groups
under addition. Before examining Ker($\phi$) and Im($\phi$), you should first check that the
function $\phi$ is well defined.


62. 6:Z— C%, where ¢(n) = cis(*%*), for all n € Z. Note that Z is a group under addition 
and C* is a group under multiplication. 


63. Show that any subgroup of a cyclic group is cyclic. 


8.3 Solvable Groups 


In the previous two sections, we were introduced to some of the basic concepts of finite group 
theory. In the next two sections, we focus on the group-theoretic topics needed for Galois’ 
proof of the insolvability of the quintic. 


Remember that for a normal subgroup the right cosets and left cosets corresponding to the 
subgroup are equal. Furthermore, Lagrange’s Theorem told us how many cosets there are, and 
Theorem 8.21 showed that these cosets form a group in their own right. This leads us to 


Definition 8.31. If G is a group with normal subgroup N, we let G/N denote the set of
cosets corresponding to N. Then G/N is a group and we call it a factor group or quotient
group. In addition, if G is finite, then $|G/N| = |G|/|N|$.


Recall that for any group G, both G and {e} are normal subgroups. Therefore, we can always 
form the factor groups G/G and G/{e}. Observe that G/G is merely a group with one 
element, and this element is a coset containing every element of G. At the other extreme, 
every element of G/{e} is a coset containing only one element of G. Furthermore, the function 


$\phi : G \to G/\{e\}$


defined as $\phi(g) = \{e\}g$ is an isomorphism, so G/{e} is isomorphic to G.


Factor groups are often difficult for students to deal with as they present us with a new level of 
abstraction. Groups are sets whose elements satisfy various nice properties under a binary 
operation. Often, the elements of a group are very concrete objects like integers, rational 
numbers, or functions. However, when dealing with factor groups, the elements of our groups 
are cosets, so they are sets in their own right. When dealing with new concepts that are quite 




abstract, it is always helpful to have concrete examples to look at. Fortunately, in trying to
understand factor groups, we can always reexamine the construction of $\mathbb{Z}_n$ from $\mathbb{Z}$ to help
reinforce these new ideas.


Let us now look at three other examples of factor groups. 


Examples


1. Let $G = S_3$, and, using the notation for $S_3$ of the previous two sections, let N be the
three element subgroup {e, j, k}. As we saw in Section 8.2, N is a normal subgroup,
and there are two cosets, each with three names: Ne = Nj = Nk and Nf = Ng = Nh.
Since N is normal, there is no difference between a right coset and a left coset,
and, more importantly, the name of the coset does not affect the answer when
multiplying. If we use the names Ne and Nf for our two cosets, then we obtain the
following table for the factor group $S_3/N$:


 ·  | Ne   Nf
----+---------
 Ne | Ne   Nf
 Nf | Nf   Ne


2. Now let $G = C_6 = \{e, g, g^2, g^3, g^4, g^5\}$ be the cyclic group with six elements and let
$H = \langle g^2 \rangle = \{e, g^2, g^4\}$ be the cyclic subgroup generated by $g^2$. Since G is abelian,
all of its subgroups are normal. As in the previous example, there are again two
cosets, each with three names: $He = Hg^2 = Hg^4$ and $Hg = Hg^3 = Hg^5$. If we use the
names He and Hg for the cosets, then the factor group $C_6/H$ has table


 ·  | He   Hg
----+---------
 He | He   Hg
 Hg | Hg   He


3. Let $G = D_4$, the fourth dihedral group. Recall that every element of $D_4$ is of the form
$y^i x^j$, where $0 \leq i \leq 3$, $0 \leq j \leq 1$, $y^4 = x^2 = e$, and $xy = y^3x$. The center of every group
is always a normal subgroup and, in this case, $Z(D_4) = \{e, y^2\}$. Since $D_4$ has order 8
and $Z(D_4)$ has two elements, the factor group $D_4/Z(D_4)$ consists of four cosets
each with two names:


$Z(D_4)e = Z(D_4)y^2$, $Z(D_4)x = Z(D_4)y^2x$,


$Z(D_4)y = Z(D_4)y^3$, and $Z(D_4)yx = Z(D_4)y^3x$.




If we use the names Z(D_4)e, Z(D_4)x, Z(D_4)y, Z(D_4)yx for the cosets, then the factor
group D_4/Z(D_4) has table

 ·        | Z(D_4)e    Z(D_4)x    Z(D_4)y    Z(D_4)yx
----------+-------------------------------------------
 Z(D_4)e  | Z(D_4)e    Z(D_4)x    Z(D_4)y    Z(D_4)yx
 Z(D_4)x  | Z(D_4)x    Z(D_4)e    Z(D_4)yx   Z(D_4)y
 Z(D_4)y  | Z(D_4)y    Z(D_4)yx   Z(D_4)e    Z(D_4)x
 Z(D_4)yx | Z(D_4)yx   Z(D_4)y    Z(D_4)x    Z(D_4)e


Let us now examine what the previous examples tell us. If G is a group with normal subgroup 
N, then it should come as little surprise that both the factor group G/N and the subgroup N 
inherit properties from G. For example, if G is abelian, then both G/N and N are abelian. 
This is certainly the case in our second example where $G = C_6$.


It is now natural to wonder if information about G/N and N can be used to obtain information 
about G. One piece of information about G that can be obtained by looking at G/N and N is 
its size. If G is a finite group, then the proof of Lagrange’s Theorem tells us that 


$|G| = |G/N| \cdot |N|.$


In fact, if we merely assumed that $|G/N|$ and $|N|$ were finite but did not assume that $|G|$ was
finite, the proof of Lagrange's Theorem would still show that $|G| = |G/N| \cdot |N|$. Thus, the
sizes of G/N and N do tell us the size of G.


However, the situation is quite different if we look at the property of being abelian. First we
introduce some notation. If $G_1$ and $G_2$ are isomorphic, we write $G_1 \cong G_2$, but if $G_1$ and
$G_2$ are not isomorphic, we write $G_1 \not\cong G_2$. In our first example,


$S_3/N \cong C_2$ and $N \cong C_3$.


Therefore, both $S_3/N$ and N are abelian, yet $S_3$ is not abelian. In fact, when we compare the
first two examples, we can see that


$S_3/N \cong C_6/H$ and $N \cong H$.


Thus, the factor groups in the first two examples are isomorphic, and the normal subgroups in
the first two examples are also isomorphic. However, we certainly know that $S_3 \not\cong C_6$.


Based on these examples, we see that a complete understanding of both the factor group G/N
and the normal subgroup N still does not completely determine the structure of G. When we
look at our third example, we have


$D_4/Z(D_4) \cong C_2 \times C_2$ and $Z(D_4) \cong C_2$.




Thus, both $D_4/Z(D_4)$ and $Z(D_4)$ are abelian, yet $D_4$ is not abelian. When you look at the
table for $D_4/Z(D_4)$, you can see that the order of every element of the factor group is either
one or two. Thus, $D_4/Z(D_4)$ is not cyclic. This explains why $D_4/Z(D_4) \not\cong C_4$. Since there are
only two groups of order 4, it immediately follows that $D_4/Z(D_4) \cong C_2 \times C_2$.
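

The third example can be reproduced by direct computation. In the sketch below, which is only an illustration, the element $y^i x^j$ of $D_4$ is stored as the pair (i, j), and the relation $xy = y^3x$ is used to multiply pairs; this encoding and the names mul, center, and coset are our own assumptions. The code finds the center, lists the cosets, and checks that every coset multiplied by itself gives the identity coset, which is why the factor group cannot be cyclic.

```python
n = 4
D4 = [(i, j) for i in range(n) for j in range(2)]  # (i, j) stands for y^i x^j


def mul(g, h):
    """(y^a x^b)(y^c x^d), using x y^c = y^(-c) x."""
    (a, b), (c, d) = g, h
    return ((a + (c if b == 0 else -c)) % n, (b + d) % 2)


center = [g for g in D4 if all(mul(g, h) == mul(h, g) for h in D4)]


def coset(g):
    """The right coset Z(D_4)g, as a set of elements."""
    return frozenset(mul(z, g) for z in center)


cosets = {coset(g) for g in D4}
identity_coset = coset((0, 0))

# (Zg)(Zg) = Z(gg); check that this is always the identity coset.
every_square_trivial = all(coset(mul(g, g)) == identity_coset for g in D4)

print(center)                 # the pairs for e and y^2
print(len(cosets))            # four cosets
print(every_square_trivial)   # no coset has order 4
```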


In all three of our examples, G contained a normal subgroup N such that both G/N and N 
were abelian. Although G need not be abelian, we can say that G is, at worst, only one step 
away from being abelian. Soon, we will generalize this idea and look at groups that may not be 
abelian but are only a finite number of steps away from being abelian. You might wonder if 
this is merely generalizing for the sake of generalizing or whether such a concept has 
applications. It turns out that this is exactly the group-theoretic concept needed for Galois’ 
work on the insolvability of the quintic. This leads us to 


Definition 8.32. If G is a group, we say that G is solvable if there exists a chain of subgroups


$G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_{n-1} \supseteq G_n = \{e\}$


such that, for $0 \leq i \leq n-1$, $G_{i+1}$ is a normal subgroup of $G_i$ and the factor group $G_i/G_{i+1}$ is
abelian.


As we will see in Chapters 15 and 17, to every polynomial p(x) in $\mathbb{Q}[x]$, we can associate a
field L such that $\mathbb{Q} \subseteq L \subseteq \mathbb{C}$. Galois showed, and we will show in Chapter 17, that if p(x)
is solvable in radicals then the Galois group $Gal(L/\mathbb{Q})$ is a solvable group. By producing
fifth-degree polynomials p(x) such that the Galois groups $Gal(L/\mathbb{Q})$ were not solvable,
Galois showed that fifth-degree polynomials are not solvable in radicals.


In mathematics, it is often useful to have several different ways to look at the same concept. 
Therefore, we now present another way to look at solvable groups. 


Proposition 8.33. A group G is solvable if and only if there exists a chain of subgroups


$G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_{n-1} \supseteq G_n = \{e\}$


such that whenever $a, b \in G_i$, we have $aba^{-1}b^{-1} \in G_{i+1}$, for $0 \leq i \leq n-1$.


Proof. In one direction, suppose G is a group with a chain of subgroups


$G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_{n-1} \supseteq G_n = \{e\}$


that satisfies the conditions in Definition 8.32. We need to show that for all $0 \leq i \leq n-1$, if
$a, b \in G_i$ then $aba^{-1}b^{-1} \in G_{i+1}$.


By Definition 8.32, the factor group $G_i/G_{i+1}$ is abelian. Therefore,


$(G_{i+1}a)(G_{i+1}b) = (G_{i+1}b)(G_{i+1}a).$




By the definition of coset multiplication, this means that $G_{i+1}ab = G_{i+1}ba$. Since ab and ba
belong to the same coset corresponding to $G_{i+1}$, we know that $(ab)(ba)^{-1} \in G_{i+1}$. However,
$(ba)^{-1} = a^{-1}b^{-1}$, so


$aba^{-1}b^{-1} = (ab)(ba)^{-1} \in G_{i+1},$


as desired.


In the other direction, suppose

$G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_{n-1} \supseteq G_n = \{e\}$


satisfies the conditions of this proposition; then we need to show that $G_{i+1}$ is a normal
subgroup of $G_i$ and the factor group $G_i/G_{i+1}$ is abelian, for $0 \leq i \leq n-1$.


In order to prove that $G_{i+1}$ is a normal subgroup of $G_i$, we need to show that if $a \in G_i$,
then $aG_{i+1} = G_{i+1}a$. Therefore, it suffices to show that if $g \in G_{i+1}$, then $ag \in G_{i+1}a$ and
$ga \in aG_{i+1}$. Since both a and g belong to $G_i$, we know that $aga^{-1}g^{-1} \in G_{i+1}$, so
$aga^{-1}g^{-1} = h$, for some $h \in G_{i+1}$. We now have


$ag = (ag)e = (ag)(a^{-1}g^{-1}ga) = (aga^{-1}g^{-1})(ga) = h(ga) = (hg)a.$

Since both h and g belong to $G_{i+1}$, we see that $hg \in G_{i+1}$ and

$ag = (hg)a \in G_{i+1}a.$


Similarly, since $g^{-1}, a^{-1} \in G_i$, it follows that


$g^{-1}a^{-1}ga = g^{-1}a^{-1}(g^{-1})^{-1}(a^{-1})^{-1} \in G_{i+1}.$


Therefore, $g^{-1}a^{-1}ga = k$, for some $k \in G_{i+1}$. Thus,


$ga = e(ga) = (agg^{-1}a^{-1})(ga) = (ag)(g^{-1}a^{-1}ga) = (ag)k = a(gk).$

Since both g and k belong to $G_{i+1}$, we know that $gk \in G_{i+1}$, so


$ga = a(gk) \in aG_{i+1}.$


To conclude the proof, we need to show that the factor group $G_i/G_{i+1}$ is abelian. Therefore, if
$a, b \in G_i$, we need to show that


$(G_{i+1}a)(G_{i+1}b) = (G_{i+1}b)(G_{i+1}a).$




We know that $aba^{-1}b^{-1} \in G_{i+1}$. However, $aba^{-1}b^{-1} = (ab)(ba)^{-1}$, which tells us that ab and
ba belong to the same coset corresponding to $G_{i+1}$. As a result,


$G_{i+1}ab = G_{i+1}ba,$


which, by the definition of coset multiplication, immediately implies that $(G_{i+1}a)(G_{i+1}b) = (G_{i+1}b)(G_{i+1}a)$, as desired.
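

Proposition 8.33 suggests a concrete way to look for such a chain in a small permutation group: let each $G_{i+1}$ be the subgroup generated by all the elements $aba^{-1}b^{-1}$ with $a, b \in G_i$. The sketch below is an illustration only. This particular choice of $G_{i+1}$ is our assumption, not a construction made in the text, and the names compose, inverse, generated, and commutator_chain are ours. If the chain reaches {e}, it satisfies the condition of the proposition by construction; if it stops shrinking first, the function returns None.

```python
from itertools import permutations


def compose(s, t):
    """Product st of two permutations of {0, ..., k-1}: apply s, then t."""
    return tuple(t[s[i]] for i in range(len(s)))


def inverse(s):
    inv = [0] * len(s)
    for i, v in enumerate(s):
        inv[v] = i
    return tuple(inv)


def generated(gens, k):
    """Subgroup of S_k generated by gens (closure under right multiplication)."""
    e = tuple(range(k))
    H, frontier = {e}, [e]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in H:
                H.add(h)
                frontier.append(h)
    return H


def commutator_chain(G, k):
    """Chain G = G_0, G_1, ... where G_{i+1} is generated by all a b a^-1 b^-1
    with a, b in G_i. Returns None if the chain stalls before reaching {e}."""
    chain = [set(G)]
    while len(chain[-1]) > 1:
        current = chain[-1]
        comms = {compose(compose(a, b), compose(inverse(a), inverse(b)))
                 for a in current for b in current}
        nxt = generated(comms, k)
        if len(nxt) == len(current):
            return None
        chain.append(nxt)
    return chain


def chain_sizes(k):
    chain = commutator_chain(set(permutations(range(k))), k)
    return None if chain is None else [len(H) for H in chain]


print(chain_sizes(3))  # sizes of the subgroups in the chain for S_3
print(chain_sizes(4))  # sizes of the subgroups in the chain for S_4
```

A returned chain certifies solvability through Proposition 8.33. That a stalled chain rules out every other chain as well is a standard fact that is not proved in this section.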


Having defined solvable groups and having briefly discussed their connection to Galois’ work, 
it is time to try to determine which familiar groups are solvable and which are not. Certainly, 
abelian groups are solvable. We next turn our attention to dihedral groups. To examine them, 
we need 


Lemma 8.34. Let G be a finite group with a subgroup H such that $|G| = 2 \cdot |H|$. Then H is a
normal subgroup of G.


Proof. We need to show that for every $g \in G$, Hg = gH. The proof will be surprisingly easy if
we look at things the right way, and in this situation, the easiest way to proceed is by
separately considering the cases where $g \in H$ and $g \notin H$.


If $g \in H$, then g and e belong to the same right coset, so Hg = He. Similarly, if $g \in H$, then g
and e also belong to the same left coset, so gH = eH. However, it is easy to see that the sets
He and eH are equal to the set H, so


Hg = He = H = eH = gH.


Observe that this part of our proof did not require that $|G| = 2 \cdot |H|$.


Now suppose $g \notin H$; since the index of H in G is two, it follows that G is the union of the
two disjoint cosets Hg and He. Therefore, Hg consists of the $|H|$ elements of G that do not
belong to He = H. Using the same reasoning, gH also consists of the $|H|$ elements of G that
do not belong to eH = H. Thus, the sets Hg and gH are the same, as needed.


It is now easy to prove 


Corollary 8.35. The dihedral group $D_n$ is solvable, for all n > 1.


Proof: The group $D_n$ consists of the 2n terms of the form $y^i x^j$, where $0 \leq i \leq n-1$, $0 \leq j \leq 1$,
$y^n = x^2 = e$, and $xy = y^{n-1}x$. If we let H be the cyclic subgroup generated by y, then


$H = \langle y \rangle = \{e, y, y^2, \ldots, y^{n-1}\}.$




Writing $G = D_n$, since $|H| = n$ and $|G| = 2n$, we can apply Lemma 8.34 to assert that H is a normal subgroup
of G. Therefore, we now have the chain


$G \supseteq H \supseteq \{e\},$

where H is normal in G, {e} is normal in H, and the factor groups


$G/H \cong C_2$, $H/\{e\} \cong H \cong C_n$


are both abelian. As a result, $D_n$ is solvable.
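

The key step in this proof, that the subgroup H generated by y is normal in $D_n$, can be checked directly for small n. In the sketch below, which is only an illustration, the element $y^i x^j$ is stored as the pair (i, j) and multiplied using $xy = y^{n-1}x$; the encoding and the names mul and rotation_subgroup_is_normal are our own assumptions.

```python
def mul(g, h, n):
    """(y^a x^b)(y^c x^d) in D_n, using x y^c = y^(-c) x."""
    (a, b), (c, d) = g, h
    return ((a + (c if b == 0 else -c)) % n, (b + d) % 2)


def rotation_subgroup_is_normal(n):
    """Check Hg = gH for every g in D_n, where H = <y>."""
    G = [(i, j) for i in range(n) for j in range(2)]
    H = [(i, 0) for i in range(n)]
    return all({mul(h, g, n) for h in H} == {mul(g, h, n) for h in H}
               for g in G)


print([rotation_subgroup_is_normal(n) for n in range(3, 9)])
```

Since H has index two, each g outside H gives the single coset of elements not in H, which is exactly the situation described in Lemma 8.34.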


Having shown that $D_n$ is solvable for all n > 2, it is reasonable to investigate when symmetric
groups are solvable. Since $S_1$ and $S_2$ are abelian, we know they are solvable. Earlier in this
section, we saw that $S_3$ is solvable, and at the end of Section 8.4, we will show that $S_4$ is also
solvable. However, this is as far as we can go with symmetric groups. In Section 8.4, we will
show that $S_n$ is not solvable for all $n \geq 5$. The fact that $S_5$ is not solvable will turn out to be a
very important piece of the puzzle in proving the insolvability of the quintic.


The next result will be very useful in examining groups of order $p^n$. It is another illustration of
the importance of equivalence relations and equivalence classes in abstract algebra.


Theorem 8.36 (The Class Equation). If G is a finite group, then G contains a subset
$\{g_1, \ldots, g_m\}$ of $m \geq 0$ elements that do not belong to Z(G), such that


$|G| = |Z(G)| + \frac{|G|}{|C_G(g_1)|} + \frac{|G|}{|C_G(g_2)|} + \cdots + \frac{|G|}{|C_G(g_m)|}.$


For an abelian group G, the Class Equation does not provide any new information about G as 
in this case m = 0, and the equation merely says that |G| = |Z(G)|. However, for various 
nonabelian groups, the Class Equation is very useful in examining the size of Z(G). At this 
point, it is probably unclear how we go about using the Class Equation. Therefore, we will 
provide several applications before proving Theorem 8.36. 


Theorem 8.37. Let G be a group with $p^n$ elements, where p is prime and $n \geq 1$. Then Z(G)
has at least p elements.


Proof. If G is abelian, there is nothing to prove as $|Z(G)| = |G| \geq p$. On the other hand, if G
is nonabelian, we can look at the class equation


$|G| = |Z(G)| + \frac{|G|}{|C_G(g_1)|} + \frac{|G|}{|C_G(g_2)|} + \cdots + \frac{|G|}{|C_G(g_m)|}$


and note that $m \geq 1$.




Since $|G| = p^n$, Lagrange's Theorem tells us that if $1 \leq i \leq m$, then


$\frac{|G|}{|C_G(g_i)|} = p^t,$


for some $t \geq 0$. However, since $g_i \notin Z(G)$, we know that $C_G(g_i)$ is not all of G. Hence,
$\frac{|G|}{|C_G(g_i)|} > 1$, so $\frac{|G|}{|C_G(g_i)|}$ is a multiple of p.

When we look at the class equation, $|G|$ is a multiple of p, and every term of the form $\frac{|G|}{|C_G(g_i)|}$
is also a multiple of p. Since we can rewrite the class equation as


$|Z(G)| = |G| - \left( \frac{|G|}{|C_G(g_1)|} + \frac{|G|}{|C_G(g_2)|} + \cdots + \frac{|G|}{|C_G(g_m)|} \right),$


we see that $|Z(G)|$ is also a multiple of p. Since $e \in Z(G)$, we know that $|Z(G)|$ is both at
least one and also a multiple of p. This tells us that $|Z(G)| \geq p$, as desired.


Note that Theorem 8.37 says nothing about groups of order 6 but does guarantee that for groups
with eight elements, the center must have at least two elements. This is perfectly consistent
with our previous observations that $|S_3| = 6$, $|Z(S_3)| = 1$, $|D_4| = 8$, and $|Z(D_4)| = 2$. We can
now use Theorem 8.37 to completely describe, up to isomorphism, groups of order $p^2$.


Corollary 8.38. If G is a group of order $p^2$, where p is prime, then G is abelian.
Furthermore, either $G \cong C_{p^2}$ or $G \cong C_p \times C_p$.


Proof. By way of contradiction, suppose G is not abelian. There exists $b \in G$ such that
$b \notin Z(G)$. We will now examine $C_G(b)$.
Certainly $C_G(b)$ contains Z(G). By Theorem 8.37, $|Z(G)| \geq p$; therefore,


$|C_G(b)| \geq |Z(G)| \geq p.$


On the other hand, b belongs to $C_G(b)$ and does not belong to Z(G), so the previous inequality
now becomes


$|C_G(b)| > |Z(G)| \geq p.$


Lagrange's Theorem tells us $|C_G(b)|$ must divide $p^2$. However, we know that $|C_G(b)| > p$, so it
must be the case that $|C_G(b)| = p^2 = |G|$. As a result, $C_G(b)$ is equal to all of G, which means
that b commutes with every element of G. But this contradicts the fact that $b \notin Z(G)$, so G is
abelian.


By Lagrange's Theorem, every element of G has order 1, p, or $p^2$. If G contains an element of
order $p^2$, then G is cyclic, and we know that $G \cong C_{p^2}$. Therefore, it only remains to consider
the case where every element of G, other than the identity, has order p.




Let $a \in G$ such that o(a) = p and then let $b \in G$ such that $b \notin \langle a \rangle$. It is clear that b also has
order p, and we now consider the set


$H = \{a^i b^j \mid 0 \leq i, j \leq p-1\}.$


Since $a^p = b^p = e$ and ab = ba, we can see that H is closed under multiplication, so by
Proposition 8.6(b), H is a subgroup of G. By Lagrange's Theorem, $|H|$ must divide $p^2$.
However, not only does H contain the p elements in < a >, but it also contains b. Therefore,
H has more than p elements, so $|H| = p^2 = |G|$. Thus, every element of G can be written
uniquely in the form $a^i b^j$, where $0 \leq i, j \leq p-1$.


Next, if we let $g \in C_p$ such that $\langle g \rangle = C_p$, we can define the function

$\phi : G \to C_p \times C_p$

as

$\phi(a^i b^j) = (g^i, g^j),$

for all $0 \leq i, j \leq p-1$.


It is easy to see that $\phi$ is a bijection from G to $C_p \times C_p$. To show that $\phi$ is an isomorphism,
let $h_1, h_2 \in G$; we need to show that $\phi(h_1h_2) = \phi(h_1)\phi(h_2)$. There exist i, j, k, l, with
$0 \leq i, j, k, l \leq p-1$, such that $h_1 = a^i b^j$ and $h_2 = a^k b^l$. If we let n and m be, respectively, the
remainders when i+k and j+l are divided by p, we now have


$\phi(h_1h_2) = \phi((a^i b^j)(a^k b^l)) = \phi(a^{i+k} b^{j+l}) = \phi(a^n b^m) = (g^n, g^m) =$
$(g^{i+k}, g^{j+l}) = (g^i, g^j)(g^k, g^l) = \phi(a^i b^j)\phi(a^k b^l) = \phi(h_1)\phi(h_2),$


as required. 


Having seen some applications of the Class Equation, it is now time for its proof. 


Proof of Theorem 8.36 (The Class Equation). As mentioned earlier, this proof will be another
illustration of the importance of equivalence relations and equivalence classes in abstract algebra. If
$g, h \in G$, we define ~ as g ~ h when there exists $a \in G$ such that $h = a^{-1}ga$. We want to show
that ~ is an equivalence relation, so we need to show that it is reflexive, symmetric, and
transitive.


If $g \in G$ and if e is the identity element of G, then

$g = ege = e^{-1}ge.$


As a result, for every $g \in G$, g ~ g, thus ~ is reflexive.




To show that ~ is symmetric, suppose g, h € G such that g ~ h. We need to show that 
h ~ g. Since g ~ h, there exists a € G such that h = a~' ga. If we take this equation and 
solve for g, we obtain 


g=aha!=(a'!)'ha"!. 


If we let b = a~!, then the previous equation becomes g = b~'hb, and since b € G, we see that 
h ~ g. Thus, ~ is also symmetric. 


Next, to show that ~ is transitive, suppose g, h, k € G such that g ~ h andh ~ k. We need to 
show that g ~ k. We know that there exist c,d € G such that h = c7! gc and k = d~'hd. 
Substituting the first equation into the second results in 


k=d'hd=d'(c"!gc)d = (d~'c~!)g(cd) = (cd)! g(cd). 


Since cd € G, the preceding equation tells us that g ~ k, so ~ is also transitive and is therefore 
an equivalence relation. 


Having shown that $\sim$ is an equivalence relation, the next step is to start looking at the
equivalence classes. Observe that if $g \in G$, then the equivalence class $[g]$ consists of the
elements of $G$ of the form $a^{-1}ga$, where $a$ ranges through all $|G|$ elements of $G$. Certainly not
all $|G|$ choices for $a$ produce different elements of $[g]$. In fact, $[g]$ consists of the single
element $g$ exactly when $a^{-1}ga = g$ for every $a \in G$, that is, if and only if $g \in Z(G)$. Next, let
$A_1, A_2, \ldots, A_m$ be the equivalence classes that contain more than one element. The facts that
every element of $G$ belongs to exactly one equivalence class and the number of elements whose
equivalence class contains only one element equals $|Z(G)|$ can now be expressed in the equation

$$|G| = |Z(G)| + |A_1| + |A_2| + \cdots + |A_m|. \tag{4}$$


The question remains, if $g \notin Z(G)$, then how do we determine the size of $[g]$? If $a, b \in G$ such
that $a^{-1}ga = b^{-1}gb$, then multiplying on the left by $b$ and on the right by $a^{-1}$ yields

$$ba^{-1}g = gba^{-1}.$$

As a result, $ba^{-1}$ commutes with $g$, so there exists $h \in C(g)$ such that $ba^{-1} = h$. This tells
us that

$$b = ha \in C(g)a.$$

Therefore, we now know that $a$ and $b$ belong to the same right coset corresponding to $C(g)$, so
$C(g)a = C(g)b$.

On the other hand, if $C(g)a = C(g)b$, then $a = hb$, for some $h \in C(g)$. Therefore,

$$a^{-1}ga = (hb)^{-1}g(hb) = b^{-1}h^{-1}ghb = b^{-1}h^{-1}hgb = b^{-1}gb.$$




The preceding arguments combine to show that if $a, b \in G$, then $a^{-1}ga = b^{-1}gb$ if and only if
$C(g)a = C(g)b$. As a result, the number of elements of the form $a^{-1}ga$ is the same as the
number of right cosets corresponding to the subgroup $C(g)$. However, Lagrange’s Theorem
already told us that the number of right cosets corresponding to $C(g)$ is equal to $\frac{|G|}{|C(g)|}$.

For $1 \le i \le m$, choose some $g_i \in A_i$. Therefore $A_i = [g_i]$ and, using our previous observations,
we have succeeded in showing that $|A_i| = \frac{|G|}{|C(g_i)|}$. Substituting this into (4) now gives us

$$|G| = |Z(G)| + \frac{|G|}{|C(g_1)|} + \frac{|G|}{|C(g_2)|} + \cdots + \frac{|G|}{|C(g_m)|},$$

thereby concluding the proof.
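
The counting in this proof can be checked by brute force on a small group. The following is only a sketch under stated assumptions: it takes $G = S_n$ with permutations encoded as tuples on $\{0, \ldots, n-1\}$, and the helper names (`compose`, `inverse`, `class_equation`) are illustrative choices, not notation from the text. It forms the classes $[g] = \{a^{-1}ga\}$, counts the one-element classes as $|Z(G)|$, and confirms that each larger class has $|G|/|C(g)|$ elements.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def class_equation(n):
    """Return (|G|, |Z(G)|, sorted sizes of the classes with more than one element) for G = S_n."""
    group = list(permutations(range(n)))
    order = len(group)

    # The equivalence class [g] is the set of all a^{-1} g a with a in G.
    classes = {
        frozenset(compose(compose(inverse(a), g), a) for a in group)
        for g in group
    }
    # One-element classes correspond exactly to the elements of the center.
    center_size = sum(1 for c in classes if len(c) == 1)
    big_classes = [c for c in classes if len(c) > 1]

    # Each larger class should contain |G| / |C(g)| elements.
    for c in big_classes:
        g = next(iter(c))
        centralizer_size = sum(1 for a in group if compose(a, g) == compose(g, a))
        assert len(c) * centralizer_size == order

    sizes = sorted(len(c) for c in big_classes)
    assert order == center_size + sum(sizes)  # equation (4)
    return order, center_size, sizes


print(class_equation(3))  # (6, 1, [2, 3])
print(class_equation(4))  # (24, 1, [3, 6, 6, 8])
```

For $S_3$ this reads $6 = 1 + 2 + 3$, and for $S_4$ it reads $24 = 1 + 3 + 6 + 6 + 8$.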


As we saw earlier in this section, a group $G$ need not be abelian even if a normal subgroup $N$
and the factor group $G/N$ are both abelian. If we think of solvable as being, at most, a finite
number of steps away from being abelian, then it is reasonable to ask, if both $N$ and $G/N$ are
solvable, must $G$ also be solvable? After all, if $N$ is $n$ steps away from being abelian and $G/N$
is $m$ steps away, then perhaps $G$ is, at most, $n + m$ steps away. In order to prove this, we need
a technical result which has many applications in group theory.


Theorem 8.39 (The Isomorphism Theorem). Let $\phi : G_1 \to G_2$ be a homomorphism of
groups.

(a) If we let $N$ denote the kernel of $\phi$, then the factor group $G_1/N$ is isomorphic to $\mathrm{Im}(\phi)$.

(b) If $H_1$ is a subgroup of $G_1$, then $\phi(H_1)$ is a subgroup of $G_2$.

(c) If $H_2$ is a subgroup of $G_2$, then $\phi^{-1}(H_2) = \{g \in G_1 \mid \phi(g) \in H_2\}$ is a subgroup of $G_1$
containing $N$.

(d) If $H_2$ is a normal subgroup of $G_2$, then $\phi^{-1}(H_2)$ is a normal subgroup of $G_1$.

Proof. By Theorem 8.24, we know that $N$ is a normal subgroup of $G_1$ and $\mathrm{Im}(\phi)$ is a
subgroup of $G_2$. In this proof, we will let $e_1, e_2$ denote, respectively, the identity elements of
$G_1$ and $G_2$.


For part (a), we need to find a homomorphism

$$\nu : G_1/N \to \mathrm{Im}(\phi)$$

which is also a bijection. When we multiply the cosets in $G_1/N$, we do so by multiplying their
names. Therefore, it is reasonable to define $\nu$ by using the names of the cosets. Since $\phi$ is
defined on all elements of $G_1$, this suggests defining $\nu$ as

$$\nu(Ng) = \phi(g),$$

for all $g \in G_1$.




We first need to show that $\nu$ is well defined. This means that changing the name of a coset will
not change the value of $\nu$. To this end, suppose $Ng = Nh$, for some $g, h \in G_1$, and we need to
show that $\nu(Ng) = \nu(Nh)$. Since $Ng = Nh$, we know that $g \in Ng = Nh$, so $g = xh$, for some
$x \in N$. As a result,

$$\nu(Ng) = \phi(g) = \phi(xh) = \phi(x)\phi(h) = e_2\phi(h) = \phi(h) = \nu(Nh),$$

as required.


Next, to show that $\nu$ is surjective, we must show that if $y \in \mathrm{Im}(\phi)$, then $y \in \mathrm{Im}(\nu)$. There exists
some $g \in G_1$ such that $y = \phi(g)$, so

$$y = \phi(g) = \nu(Ng) \in \mathrm{Im}(\nu),$$

as required.

In order to prove that $\nu$ is injective, we must show that if $g, h \in G_1$ such that $\nu(Ng) = \nu(Nh)$,
then $Ng = Nh$. Since $\nu(Ng) = \nu(Nh)$, we know that $\phi(g) = \phi(h)$. Therefore,

$$\phi(gh^{-1}) = \phi(g)\phi(h^{-1}) = \phi(g)\phi(h)^{-1} = \phi(h)\phi(h)^{-1} = e_2.$$

As a result, $gh^{-1} \in N$, so $g \in Nh$ and the cosets $Ng$ and $Nh$ are the same. Thus, $\nu$ is indeed a bijection.

To complete the proof of part (a), if $g, h \in G_1$, we can use the definition of coset multiplication
and the fact that $\phi$ is a homomorphism to see

$$\nu((Ng)(Nh)) = \nu(N(gh)) = \phi(gh) = \phi(g)\phi(h) = \nu(Ng)\nu(Nh).$$

Thus, $\nu$ is also a homomorphism, so $\nu$ is an isomorphism. Therefore, $G_1/N$ and $\mathrm{Im}(\phi)$ are
isomorphic.


For part (b), in light of Proposition 8.6(a), it suffices to show that if $x, y \in \phi(H_1)$, then
$xy^{-1} \in \phi(H_1)$. We know that there exist $g, h \in H_1$ such that $x = \phi(g)$ and $y = \phi(h)$. Since
$\phi$ is a homomorphism and $gh^{-1} \in H_1$, it follows that

$$xy^{-1} = \phi(g)\phi(h)^{-1} = \phi(g)\phi(h^{-1}) = \phi(gh^{-1}) \in \phi(H_1).$$

For part (c), we again use Proposition 8.6(a), and it suffices to show that if $g, h \in \phi^{-1}(H_2)$,
then $gh^{-1} \in \phi^{-1}(H_2)$. By the definition of $\phi^{-1}(H_2)$, both $\phi(g)$ and $\phi(h)$ belong to $H_2$. Since
$H_2$ is a subgroup of $G_2$, we know that $\phi(g)\phi(h)^{-1} \in H_2$. Using that $\phi$ is a homomorphism,
it follows that

$$\phi(gh^{-1}) = \phi(g)\phi(h^{-1}) = \phi(g)\phi(h)^{-1} \in H_2.$$




The preceding equation indicates that $gh^{-1} \in \phi^{-1}(H_2)$, so $\phi^{-1}(H_2)$ is a subgroup.
Furthermore,

$$\phi(N) = \{e_2\} \subseteq H_2,$$

so $\phi^{-1}(H_2)$ contains $N$, as required.


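For part (d), we may now assume that the subgroup $H_2$ from part (c) is also normal. If $a \in G_1$
and $g \in \phi^{-1}(H_2)$, we must show that

$$ag \in \phi^{-1}(H_2)a \quad \text{and} \quad ga \in a\phi^{-1}(H_2).$$

The fact that $H_2$ is normal implies that

$$\phi(a)\phi(g) \in \phi(a)H_2 = H_2\phi(a) \quad \text{and} \quad \phi(g)\phi(a) \in H_2\phi(a) = \phi(a)H_2.$$

Therefore, there exist $x, y \in H_2$ such that

$$\phi(a)\phi(g) = x\phi(a) \quad \text{and} \quad \phi(g)\phi(a) = \phi(a)y.$$

Since $\phi$ is a homomorphism, the previous equations imply that

$$\phi(aga^{-1}) = \phi(a)\phi(g)\phi(a)^{-1} = x\phi(a)\phi(a)^{-1} = x \in H_2$$

and

$$\phi(a^{-1}ga) = \phi(a^{-1})\phi(g)\phi(a) = \phi(a)^{-1}\phi(a)y = y \in H_2.$$

Therefore, $aga^{-1}, a^{-1}ga \in \phi^{-1}(H_2)$. We now have

$$ag = (aga^{-1})a \in \phi^{-1}(H_2)a \quad \text{and} \quad ga = a(a^{-1}ga) \in a\phi^{-1}(H_2),$$

as desired.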
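
Part (a) can be traced through on a small concrete homomorphism. The sketch below makes its own choices, none of which come from the text: the group is the integers modulo 12 under addition (so the coset written $Ng$ in the theorem appears as $N + g$), the homomorphism is $x \mapsto 3x \bmod 12$ from that group to itself, and the names `phi`, `coset`, and `nu` are illustrative. It checks that the map on cosets is well defined, bijective onto the image, and a homomorphism.

```python
n = 12
G = range(n)


def phi(x):
    """An assumed example homomorphism from Z_12 to Z_12: multiplication by 3."""
    return (3 * x) % n


kernel = frozenset(x for x in G if phi(x) == 0)
image = {phi(x) for x in G}


def coset(g):
    """The coset N + g, where N is the kernel (additive analogue of Ng)."""
    return frozenset((x + g) % n for x in kernel)


cosets = {coset(g) for g in G}

# Well defined: phi takes a single value on each coset, so nu(N + g) = phi(g)
# does not depend on the name g chosen for the coset.
nu = {}
for c in cosets:
    values = {phi(g) for g in c}
    assert len(values) == 1
    nu[c] = values.pop()

assert set(nu.values()) == image              # nu is surjective onto Im(phi)
assert len(set(nu.values())) == len(cosets)   # nu is injective

# nu respects the operation: nu((N+g) + (N+h)) = nu(N+g) + nu(N+h).
for g in G:
    for h in G:
        assert nu[coset((g + h) % n)] == (nu[coset(g)] + nu[coset(h)]) % n

print(sorted(kernel), sorted(image), len(cosets))  # [0, 4, 8] [0, 3, 6, 9] 4
```

Here the kernel has 3 elements, so there are $12/3 = 4$ cosets, matching the 4 elements of the image.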


We now have the tools needed to prove 


Corollary 8.40. If $G$ is a group with normal subgroup $N$, then $G$ is solvable if and only if
both $G/N$ and $N$ are solvable.


Proof. In one direction, suppose $G$ is solvable. Then, by Proposition 8.33, there exists a chain
of subgroups

$$G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_{n-1} \supseteq G_n = \{e\}$$

where $aba^{-1}b^{-1} \in G_{i+1}$, whenever $a, b \in G_i$ and $0 \le i \le n-1$.


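Let $N_i = N \cap G_i$, for $0 \le i \le n$. Since $N$ is a subgroup, it follows that if $a, b \in N_i$ and
$0 \le i \le n-1$, then

$$aba^{-1}b^{-1} \in N \cap G_{i+1} = N_{i+1}.$$

Therefore, the chain of subgroups of $N$,

$$N = N_0 \supseteq N_1 \supseteq N_2 \supseteq \cdots \supseteq N_{n-1} \supseteq N_n = \{e\},$$

shows that $N$ is solvable.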


Next, we need to show that if $G$ is solvable, then so is $G/N$. Let $H = G/N$ and consider the
function

$$\phi : G \to H,$$

where $\phi(g) = Ng$. Observe that $\phi$ is a homomorphism as

$$\phi(gh) = N(gh) = (Ng)(Nh) = \phi(g)\phi(h),$$

for all $g, h \in G$. Therefore, we are in a position to apply Theorem 8.39.


By Theorem 8.39(b), $\phi(G_i)$ is a subgroup of $H$, for $0 \le i \le n$. Therefore, $H$ has a chain of
subgroups

$$H = \phi(G_0) \supseteq \phi(G_1) \supseteq \phi(G_2) \supseteq \cdots \supseteq \phi(G_{n-1}) \supseteq \phi(G_n) = \{e_H\},$$

where $e_H$ is the identity element of $H$. Furthermore, if $a, b \in \phi(G_i)$, for $0 \le i \le n-1$, then
$a = \phi(x)$ and $b = \phi(y)$, where $x, y \in G_i$. Since $\phi$ is a homomorphism and $xyx^{-1}y^{-1} \in G_{i+1}$,
we have

$$aba^{-1}b^{-1} = \phi(x)\phi(y)\phi(x)^{-1}\phi(y)^{-1} = \phi(x)\phi(y)\phi(x^{-1})\phi(y^{-1}) = \phi(xyx^{-1}y^{-1}) \in \phi(G_{i+1}).$$

Therefore, our chain of subgroups for $H$ satisfies the conditions of Proposition 8.33, so
$G/N = H$ is solvable.


Note that the previous portion of our argument actually shows that the image under a 
homomorphism of a solvable group is also a solvable group. 


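In the other direction, let us assume that both $N$ and $G/N$ are solvable. Since $N$ is solvable,
there exists a chain of subgroups

$$N = N_0 \supseteq N_1 \supseteq N_2 \supseteq \cdots \supseteq N_{n-1} \supseteq N_n = \{e\}$$

such that $aba^{-1}b^{-1} \in N_{i+1}$, whenever $a, b \in N_i$ and $0 \le i \le n-1$. If we let $H = G/N$ and $\phi$
be as previously, then there also exists a chain of subgroups

$$H = H_0 \supseteq H_1 \supseteq H_2 \supseteq \cdots \supseteq H_{m-1} \supseteq H_m = \{e_H\},$$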




where $e_H$ is the identity of $H$ and $aba^{-1}b^{-1} \in H_{i+1}$, whenever $a, b \in H_i$ and $0 \le i \le m-1$.

By Theorem 8.39(c), $\phi^{-1}(H_i)$ is a subgroup of $G$, for $0 \le i \le m$, and $G$ has the chain of
subgroups

$$G = \phi^{-1}(H_0) \supseteq \phi^{-1}(H_1) \supseteq \phi^{-1}(H_2) \supseteq \cdots \supseteq \phi^{-1}(H_{m-1}) \supseteq \phi^{-1}(H_m) = N.$$

Next, if $0 \le i \le m-1$ and if we let $a, b \in \phi^{-1}(H_i)$, then we have $\phi(a) = x$ and $\phi(b) = y$,
where $x, y \in H_i$. Since $\phi$ is a homomorphism and $xyx^{-1}y^{-1} \in H_{i+1}$, we have

$$\phi(aba^{-1}b^{-1}) = \phi(a)\phi(b)\phi(a^{-1})\phi(b^{-1}) = \phi(a)\phi(b)\phi(a)^{-1}\phi(b)^{-1} = xyx^{-1}y^{-1} \in H_{i+1}.$$

Therefore, $aba^{-1}b^{-1} \in \phi^{-1}(H_{i+1})$. As a result, the chain of subgroups for $G$ satisfies all the
conditions of Proposition 8.33 except that it does not go all the way down to the subgroup $\{e\}$.
However, if we place the chain of subgroups we obtained for $N$ at the end of the chain of
subgroups for $G$, we obtain

$$G = \phi^{-1}(H_0) \supseteq \phi^{-1}(H_1) \supseteq \cdots \supseteq \phi^{-1}(H_{m-1}) \supseteq \phi^{-1}(H_m) = N = N_0 \supseteq N_1 \supseteq \cdots \supseteq N_{n-1} \supseteq N_n = \{e\}.$$

Observe that this new chain satisfies all the properties required in Proposition 8.33, so $G$ is
indeed solvable.
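
The chain condition used in this proof can be tested by brute force on a small example. The sketch below is an illustration under its own assumptions, not the construction used in the text: it takes $G = S_4$ (permutations as tuples on $\{0,1,2,3\}$) and builds one particular chain by repeatedly taking the subgroup generated by all commutators $aba^{-1}b^{-1}$, which automatically places every commutator of one term in the next. It then intersects that chain with a normal subgroup $N$, as in the first part of the proof. All function names are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, identity):
    """Close {identity} under right multiplication by the generators (enough in a finite group)."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for s in generators:
            y = compose(x, s)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return frozenset(elements)


def commutators(subgroup):
    """All products a b a^{-1} b^{-1} with a, b in the subgroup."""
    return {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in subgroup
        for b in subgroup
    }


def commutator_chain(group, identity):
    """Chain G = G_0, G_1, ... where G_{i+1} is generated by the commutators of G_i."""
    chain = [frozenset(group)]
    while len(chain[-1]) > 1:
        nxt = generated_subgroup(commutators(chain[-1]), identity)
        if nxt == chain[-1]:
            break  # the chain has stopped shrinking before reaching {e}
        chain.append(nxt)
    return chain


identity = tuple(range(4))
S4 = frozenset(permutations(range(4)))
chain = commutator_chain(S4, identity)
print([len(h) for h in chain])  # [24, 12, 4, 1]

# Take N to be the second term of the chain (a normal subgroup of S4 with 12 elements)
# and form N_i = N ∩ G_i; commutators of N_i must land in N_{i+1}.
N = chain[1]
N_chain = [N & h for h in chain]
for upper, lower in zip(N_chain, N_chain[1:]):
    assert commutators(upper) <= lower
print([len(h) for h in N_chain])  # [12, 12, 4, 1]
```

The repeated term in the second chain is harmless, since the chains in Proposition 8.33 are written with $\supseteq$.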


We have now developed quite a few tools that, when combined with Lagrange’s Theorem and 
Sylow’s Theorem, allow us to prove that many large classes of finite groups are solvable. In 
particular, we can now use Lemma 8.34, Theorem 8.37, and Corollary 8.40 to show 


Corollary 8.41. All groups of order $p^n$ and $2p^n$, where $p$ is prime, are solvable.


Proof. If $G$ is a group of order $p^n$, where $p$ is prime, we will show that $G$ is solvable using a
proof by contradiction. Observe that if the result does not hold, then the Well Ordering
Principle asserts that there is a smallest integer $n \ge 0$ such that there exists a group $G$ with $p^n$
elements that is not solvable. Certainly groups with only one element are solvable, so we may
assume that $n \ge 1$. As a result, Theorem 8.37 and Lagrange’s Theorem guarantee that

$$|Z(G)| = p^m > 1$$

for some integer $m$ with $1 \le m \le n$. Note that $Z(G)$ is both a normal subgroup of $G$ and, being
abelian, a solvable group. When we examine the factor group $G/Z(G)$, we have

$$|G/Z(G)| = \frac{|G|}{|Z(G)|} = p^{n-m} < p^n.$$




As a result, the minimality of $n$ implies that $G/Z(G)$ is solvable. Since $G/Z(G)$ and $Z(G)$ are
both solvable, Corollary 8.40 tells us that $G$ is indeed solvable, contradicting the assumption
that $G$ was not solvable. Thus, our proof by contradiction does indeed tell us that $G$ is
solvable.


We now consider the case where $G$ is a group of order $2p^n$, where $p$ is a prime. If $p = 2$, then
$|G| = 2^{n+1}$, and the first part of this result tells us that $G$ is solvable. On the other hand, if
$p \ne 2$, then Sylow’s Theorem tells us that $G$ contains a subgroup $H$ of order $p^n$. By the first
part of this result, we know that $H$ is solvable. Furthermore, since the index of $H$ in $G$ is 2,
Lemma 8.34 tells us that $H$ is a normal subgroup of $G$. We can now form the factor group
$G/H$, and since this group only has two elements, it is certainly solvable. Therefore, we are
in the situation where both $H$ and $G/H$ are solvable, so Corollary 8.40 asserts that $G$ is
solvable, as desired.


To indicate the power and usefulness of Corollary 8.41, suppose you were interested in 
examining whether all groups of order 512 are solvable. It turns out that there are over 
8,000,000 isomorphism classes of groups of order 512. Therefore, it would be virtually 
impossible to deal with this problem on a case by case basis. However, since $512 = 2^9$,
Corollary 8.41 immediately tells us that every group of order 512 is solvable. This indicates 
that although some results in algebra may, at first, appear to be very technical or abstract, they 
often provide quick and easy solutions to problems that we would not want to deal with on a 
case by case basis. 


In order to prove that other classes of finite groups are solvable, we need two more tools. The 
first is 


Lemma 8.42. Let $G$ be a group with finite subgroups $H$ and $K$.

(a) If $|H|$ and $|K|$ are relatively prime, then $H \cap K = \{e\}$.

(b) If $H \cap K = \{e\}$, then $\{hk \mid h \in H, k \in K\}$ consists of $|H| \cdot |K|$ different elements of $G$.

Proof. For part (a), Proposition 8.6(c) tells us that $H \cap K$ is a subgroup of both $H$ and $K$. By
Lagrange’s Theorem, $|H \cap K|$ must divide both $|H|$ and $|K|$. Since $|H|$ and $|K|$ are relatively
prime, it must now be the case that $|H \cap K| = 1$. Hence, $H \cap K = \{e\}$.


For part (b), when we look at $\{hk \mid h \in H, k \in K\}$, there are certainly $|H|$ choices for $h$ and $|K|$
choices for $k$. Therefore, there are $|H| \cdot |K|$ possible ways to form products of this type.
However, we need to make sure that all $|H| \cdot |K|$ products do indeed produce different
elements of $G$. To this end, suppose $h_1, h_2 \in H$ and $k_1, k_2 \in K$ such that

$$h_1k_1 = h_2k_2.$$




By multiplying this equation on the left by $h_2^{-1}$ and on the right by $k_1^{-1}$, we obtain

$$h_2^{-1}h_1 = k_2k_1^{-1} \in H \cap K = \{e\}.$$

However, since both $h_2^{-1}h_1$ and $k_2k_1^{-1}$ are equal to $e$, we immediately obtain $h_1 = h_2$
and $k_1 = k_2$. Thus, different pairs of elements from $H$ and $K$ always produce different
products in $G$. Thus, $\{hk \mid h \in H, k \in K\}$ does contain $|H| \cdot |K|$ elements.
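
Both parts of the lemma are easy to observe in a small case. In the sketch below, the ambient group $S_4$, the tuple encoding of permutations on $\{0,1,2,3\}$, the particular subgroups, and the helper names are all assumptions made for illustration. A cyclic subgroup of order 3 and one of order 4 meet only in the identity and give $3 \cdot 4 = 12$ distinct products, while two subgroups with a larger intersection give fewer products than $|H| \cdot |K|$.

```python
from math import gcd


def compose(p, q):
    """Return the permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[q[i]] for i in range(len(q)))


def cyclic_subgroup(g):
    """Return the set of all powers of g."""
    identity = tuple(range(len(g)))
    elements, power = {identity}, g
    while power != identity:
        elements.add(power)
        power = compose(power, g)
    return frozenset(elements)


def products(H, K):
    """Return the set {hk : h in H, k in K}."""
    return {compose(h, k) for h in H for k in K}


identity = (0, 1, 2, 3)
H = cyclic_subgroup((1, 2, 0, 3))  # generated by a 3-cycle
K = cyclic_subgroup((1, 2, 3, 0))  # generated by a 4-cycle

# Part (a): relatively prime orders force a trivial intersection.
assert gcd(len(H), len(K)) == 1
assert H & K == {identity}

# Part (b): trivial intersection gives |H|*|K| distinct products.
assert len(products(H, K)) == len(H) * len(K)

# Contrast: L is contained in K, so the intersection is not trivial
# and the products repeat.
L = cyclic_subgroup((2, 3, 0, 1))  # the square of the 4-cycle, of order 2
assert L & K == L
assert len(products(L, K)) < len(L) * len(K)

print(len(products(H, K)), len(products(L, K)))  # 12 4
```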


Earlier in this section, Definition 8.32 and Proposition 8.33 gave us two equivalent ways to 
look at being solvable. As we commented at the time, it is frequently useful to have more than 
one way to look at a concept. Continuing with this theme, it is also useful to have an additional
way to look at being normal. The second part of our next lemma provides us with this. 


Lemma 8.43. If $H$ is a subgroup of a group $G$ and if $a \in G$, then $aHa^{-1} = \{aha^{-1} \mid h \in H\}$
is also a subgroup of $G$. Furthermore, $H$ is normal if and only if $aHa^{-1} = H$, for all $a \in G$.


Proof. If $a \in G$, Proposition 8.6(a) tells us that in order to show that $aHa^{-1}$ is a subgroup, we
must show that if $x, y \in aHa^{-1}$, then $xy^{-1} \in aHa^{-1}$. Since $x, y \in aHa^{-1}$, there exist $g, h \in H$
such that $x = aga^{-1}$ and $y = aha^{-1}$. Recall that to find the inverse of a product, we invert each
term and reverse the order. Therefore,

$$y^{-1} = (a^{-1})^{-1}h^{-1}a^{-1} = ah^{-1}a^{-1}.$$

Since $H$ is a subgroup, $gh^{-1} \in H$, which implies that

$$xy^{-1} = (aga^{-1})(ah^{-1}a^{-1}) = ag(a^{-1}a)h^{-1}a^{-1} = ageh^{-1}a^{-1} = a(gh^{-1})a^{-1} \in aHa^{-1}.$$

Thus, $aHa^{-1}$ is indeed a subgroup.


Next, if $H$ is normal and $a \in G$, then $aH = Ha$. Multiplying this equality of sets on the right
by $a^{-1}$ results in

$$(aH)a^{-1} = (Ha)a^{-1}.$$

However, Lemma 8.26 tells us that $(Ha)a^{-1} = H(aa^{-1}) = He = H$, so $aHa^{-1} = H$.

In the other direction, suppose $a \in G$ and $aHa^{-1} = H$. Multiplying this equality of sets on the
right by $a$ gives us

$$(aHa^{-1})a = Ha.$$

However, Lemma 8.26(a) tells us that $(aHa^{-1})a = (aH)(a^{-1}a) = (aH)e = aH$. As a result,
$aH = Ha$ and $H$ is indeed normal.
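
The two descriptions of normality can be compared directly on a small group. The sketch below assumes $G = S_3$ with permutations encoded as tuples on $\{0, 1, 2\}$; the subgroups `A` and `B` and the function names are chosen here for illustration and are not the text's notation for elements of $S_3$. It tests $aHa^{-1} = H$ for every $a$ and checks that the answer agrees with the coset condition $aH = Ha$.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def conjugate(a, H):
    """Return the set a H a^{-1}."""
    a_inv = inverse(a)
    return frozenset(compose(compose(a, h), a_inv) for h in H)


def is_normal_by_conjugates(H, G):
    """Test a H a^{-1} = H for all a in G."""
    return all(conjugate(a, H) == H for a in G)


def is_normal_by_cosets(H, G):
    """Test aH = Ha for all a in G."""
    return all(
        {compose(a, h) for h in H} == {compose(h, a) for h in H} for a in G
    )


S3 = list(permutations(range(3)))
e = (0, 1, 2)
A = frozenset({e, (1, 2, 0), (2, 0, 1)})  # identity and the two 3-cycles
B = frozenset({e, (1, 0, 2)})             # identity and one transposition

for H in (A, B):
    assert is_normal_by_conjugates(H, S3) == is_normal_by_cosets(H, S3)

print(is_normal_by_conjugates(A, S3), is_normal_by_conjugates(B, S3))  # True False
```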


We can now prove that another large class of groups is solvable. 


Theorem 8.44. If $G$ is a group of order $pq$, where $p$ and $q$ are prime, then $G$ is solvable.

Proof. If $p = q$, then $|G| = p^2$ and both Corollary 8.38 and Corollary 8.41 tell us that $G$ is
solvable. Therefore, we may assume that $p \ne q$. At this point, we let $p$ denote the larger of the
two primes and $q$ the smaller.


By Sylow’s Theorem, $G$ contains a subgroup $H$ with $p$ elements. Suppose for the moment that
$G$ also contains a subgroup $K$ such that $|K| = p$ and $H \ne K$. Note that $H \cap K$ is a subgroup of
$H$, so Lagrange’s Theorem asserts that $|H \cap K|$ equals 1 or $p$. However, if $|H \cap K| = p$, then
$H \cap K$ is a subgroup of both $H$ and $K$ with the same number of elements as $H$ and $K$. This
tells us that $H \cap K$ is equal to both $H$ and $K$, contradicting the fact that $H \ne K$. Therefore,
$|H \cap K| = 1$.

By Lemma 8.42(b), since $H \cap K = \{e\}$, $G$ contains $p^2$ different elements of the form $hk$,
where $h \in H$ and $k \in K$. However, since $p > q$, this leads to the contradiction

$$|G| \ge p^2 > pq = |G|.$$

Therefore, $G$ only contains one subgroup with $p$ elements.


Next, if $a \in G$, consider $aHa^{-1} = \{aga^{-1} \mid g \in H\}$; by Lemma 8.43, $aHa^{-1}$ is a subgroup
of $G$. Furthermore, Lemma 8.17(a) asserts that $aHa^{-1}$ has the same number of elements as
$aH$ and $aH$ has the same number of elements as $H$, so $|aHa^{-1}| = p$. However, since $H$ is the
only subgroup of $G$ with $p$ elements, it follows that $aHa^{-1} = H$ for every $a \in G$, and Lemma 8.43
now tells us that $H$ is a normal subgroup of $G$.

We now have the following chain of subgroups:

$$G \supseteq H \supseteq \{e\}.$$

Observe that $H$ is normal in $G$ and $\{e\}$ is normal in $H$. Since $|G/H| = q$ and $|H/\{e\}| = p$, both
quotient groups have prime order and are therefore abelian. Therefore, the preceding chain
satisfies the conditions of Definition 8.32, and $G$ is indeed solvable, as required.


In light of Theorem 8.44, we immediately know that groups of order 15, 21, and 35 are
solvable. In fact, if we are interested in groups of small order, Corollary 8.41 and
Theorem 8.44 handle most cases. For example, if we wanted to show that all groups of order at
most 23 are solvable, the combination of these two results would handle all cases except for
groups of order 12 and 20. However, our next result will handle these two cases. The proof
will illustrate that proofs in finite group theory often split into many cases and might require
the careful counting of the number of elements in various subsets and subgroups.


Theorem 8.45. If $G$ is a group of order $4p$, where $p$ is prime, then $G$ is solvable.

Proof. If $p = 2$, then $|G| = 2^3$ and $G$ is solvable by Corollary 8.41. Therefore, we may assume
that $p \ne 2$. Sylow’s Theorem now asserts that $G$ contains subgroups $H$ and $K$ such that
$|H| = p$ and $|K| = 4$.

We now define the set $N = \{g \in G \mid gHg^{-1} = H\}$. If $g \in H$, then $gHg^{-1} \subseteq H$. However, using
the argument used in the proof of Theorem 8.44, $|gHg^{-1}| = |H|$. But this implies that
$gHg^{-1} = H$, so $g \in N$. As a result, the set $N$ contains $H$.


We would like to show that $N$ is a subgroup of $G$. Since $N$ is a subset of the finite set $G$,
Proposition 8.6(b) tells us that it suffices to show that $N$ is closed under multiplication. To this
end, if $g, h \in N$, we have

$$(gh)H(gh)^{-1} = (gh)H(h^{-1}g^{-1}) = g(hHh^{-1})g^{-1} = gHg^{-1} = H,$$

so $gh \in N$. Now that we know that $N$ is a subgroup with $H \subseteq N \subseteq G$, Lagrange’s Theorem
tells us that the only possibilities for $|N|$ are $4p$, $2p$, and $p$.


If $|N| = 4p$, then $N$ is equal to all of $G$. Therefore, $gHg^{-1} = H$, for all $g \in G$, and Lemma 8.43
tells us that $H$ is a normal subgroup of $G$. This means that we have the chain of subgroups

$$G \supseteq H \supseteq \{e\},$$

where $H$ is normal in $G$ and $\{e\}$ is normal in $H$. Furthermore, $|G/H| = 4$ and $|H/\{e\}| = p$, so
both factor groups are abelian. This chain now satisfies the conditions of Definition 8.32, so, in
this case, $G$ is solvable.

For the next case, let us suppose that $|N| = 2p$. Now we have the chain of subgroups

$$G \supseteq N \supseteq H \supseteq \{e\}.$$

Since the index of $N$ in $G$ is 2 and the index of $H$ in $N$ is 2, Lemma 8.34 asserts that $N$ is a
normal subgroup of $G$ and $H$ is a normal subgroup of $N$. Certainly $\{e\}$ is normal in $H$ and the
factor groups $G/N$, $N/H$, and $H/\{e\}$, respectively, have orders 2, 2, and $p$. Therefore, all of
the factor groups are abelian, and the chain satisfies the conditions of Definition 8.32. Hence,
in this case, $G$ is solvable.


For the final case, we are in the situation where $|N| = p$, so we know that $N = H$. We now
define the set

$$T = \{aha^{-1} \mid h \in H, h \ne e, a \in K\},$$

and we will be interested in determining the size of $T$. Since $H$ has $p - 1$ elements other
than the identity, and $K$ has four elements, the set $T$ is formed by computing $4(p - 1)$ products.




However, we need to check whether all $4(p - 1) = 4p - 4$ of these products determine different
elements of $G$. To this end, suppose $g, h \in H$ are not the identity and $a, b \in K$ such that

$$aga^{-1} = bhb^{-1}.$$

Then we have

$$b^{-1}aga^{-1}b = h,$$

which implies that

$$(b^{-1}a)g(b^{-1}a)^{-1} = h.$$

Proposition 8.6(c) tells us that $H \cap (b^{-1}a)H(b^{-1}a)^{-1}$ is a subgroup of $H$. Since $|H| = p$,
Lagrange’s Theorem implies that $|H \cap (b^{-1}a)H(b^{-1}a)^{-1}|$ is either 1 or $p$. However, we have
seen that $H \cap (b^{-1}a)H(b^{-1}a)^{-1}$ contains both $h$ and $e$. As a result,
$|H \cap (b^{-1}a)H(b^{-1}a)^{-1}| > 1$, so $|H \cap (b^{-1}a)H(b^{-1}a)^{-1}| = p$. Since
$H \cap (b^{-1}a)H(b^{-1}a)^{-1}$, $(b^{-1}a)H(b^{-1}a)^{-1}$, and $H$ all contain $p$ elements, it follows that
$H = (b^{-1}a)H(b^{-1}a)^{-1}$. Thus, $b^{-1}a \in N$.


Since $N = H$ and $a, b \in K$, we see that

$$b^{-1}a \in N \cap K = H \cap K.$$

However, $|H|$ and $|K|$ are relatively prime, so Lemma 8.42(a) asserts that $H \cap K = \{e\}$. Thus,
$b^{-1}a = e$, which implies that $a = b$.


The facts that $a = b$ and $aga^{-1} = bhb^{-1}$ combine to tell us that $g = h$. Therefore, different
pairs of elements of $K$ and nonidentity elements of $H$ do indeed produce different elements
of $T$. As a result, $T$ consists of $4p - 4$ different elements of $G$. Observe that every element of
$T$ belongs to a group of order $p$ and is not the identity. Therefore, $T$ does not contain any
elements of order 1, 2, or 4. Since every element of $K$ has order 1, 2, or 4, $K$ consists of the
four elements of $G$ that do not belong to $T$.


If $a \in G$, let us look at the subgroup $aKa^{-1}$. This group has order 4, so every one of its
elements has order 1, 2, or 4. However, the previous argument indicates that all such elements
must belong to $K$. Hence, $aKa^{-1} = K$, so $K$ is normal. Therefore, we now have the chain of
subgroups

$$G \supseteq K \supseteq \{e\}.$$

In our present situation, $K$ is normal in $G$, $\{e\}$ is normal in $K$, and the factor groups $G/K$ and
$K/\{e\}$ have, respectively, orders $p$ and 4. Therefore, the factor groups are abelian, and the
chain satisfies the condition of Definition 8.32. As a result, in all possible cases, $G$ is solvable,
as desired.




We conclude this section by formalizing our observation that all groups of order at most 23 are 
solvable. 


Corollary 8.46. If $G$ is a group and $|G| \le 23$, then $G$ is solvable.

Proof. Corollary 8.41 covers groups of order $p^n$ and $2p^n$, where $p$ is prime. Therefore, we
know that groups of order

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 16, 17, 18, 19, 22, 23

are solvable. Since Theorem 8.44 covers groups of order $pq$, where $p$ and $q$ are prime, we also
know that groups of order 15, 21 are solvable. All that remains are groups of order 12 and 20
but, in light of Theorem 8.45, these are also solvable.
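
The bookkeeping in this proof is pure arithmetic and can be double-checked mechanically. The sketch below (function names are illustrative) assigns to each order $n \le 23$ the first of the three results whose hypothesis on $n$ is met: order $p^n$ or $2p^n$, order $pq$, or order $4p$. It only classifies the numbers; it does not test solvability of any group.

```python
from math import isqrt


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


def is_prime_power(n):
    """True when n = p**k for a prime p and some k >= 0 (so 1 counts)."""
    if n == 1:
        return True
    for p in range(2, n + 1):
        if n % p == 0:  # the smallest divisor greater than 1 is prime
            while n % p == 0:
                n //= p
            return n == 1
    return False


def covering_result(n):
    """Name the first result above whose hypothesis on the order n is satisfied, if any."""
    if is_prime_power(n) or (n % 2 == 0 and is_prime_power(n // 2)):
        return "Corollary 8.41"  # order p^n or 2p^n
    if any(n % p == 0 and is_prime(p) and is_prime(n // p) for p in range(2, n)):
        return "Theorem 8.44"    # order pq
    if n % 4 == 0 and is_prime(n // 4):
        return "Theorem 8.45"    # order 4p
    return None


table = {n: covering_result(n) for n in range(1, 24)}
assert all(result is not None for result in table.values())

print([n for n, r in table.items() if r == "Theorem 8.44"])  # [15, 21]
print([n for n, r in table.items() if r == "Theorem 8.45"])  # [12, 20]
print(covering_result(24))  # None
```

The last line shows that 24 is the first order whose form is not covered by these three results, which is why the corollary stops at 23.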


Exercises for Section 8.3 


Before doing exercises 1-22, please read the following: 


In the proof of the class equation, given $g \in G$, we looked at all elements of the form $a^{-1}ga$,
where $a \in G$. We call elements of this form conjugates of $g$. The proof of the class
equation examines the relationship between the number of conjugates of $g$ and the size of
$C(g)$. Throughout these exercises, you will often need to take advantage of this relationship.


In exercises 1-6, we use the notation for elements of $S_3$ used throughout Sections 8.1-8.3.

1. In $S_3$, first determine the number of conjugates of $f$ and then find all the conjugates of $f$.

2. In $S_3$, first determine the number of conjugates of $g$ and then find all the conjugates of $g$.

3. In $S_3$, first determine the number of conjugates of $h$ and then find all the conjugates of $h$.

4. In $S_3$, first determine the number of conjugates of $j$ and then find all the conjugates of $j$.

5. In $S_3$, first determine the number of conjugates of $k$ and then find all the conjugates of $k$.

6. Check that your answers to exercises 1-5 are consistent with the class equation.

7. In $D_4$, first determine the number of conjugates of $y$ and then find all the conjugates of $y$.

8. In $D_4$, first determine the number of conjugates of $x$ and then find all the conjugates of $x$.

9. In $D_4$, first determine the number of conjugates of $yx$ and then find all the conjugates
of $yx$.

10. In $D_4$, first determine the number of conjugates of $y^2$ and then find all the conjugates
of $y^2$.

11. In $D_4$, first determine the number of conjugates of $y^2x$ and then find all the conjugates
of $y^2x$.

12. In $D_4$, first determine the number of conjugates of $y^3$ and then find all the conjugates
of $y^3$.

13. In $D_4$, first determine the number of conjugates of $y^3x$ and then find all the conjugates
of $y^3x$.

14. Check that your answers to exercises 7-13 are consistent with the class equation.

15. In $Q_8$, first determine the number of conjugates of $-1$ and then find all the conjugates
of $-1$.

16. In $Q_8$, first determine the number of conjugates of $i$ and then find all the conjugates of $i$.

17. In $Q_8$, first determine the number of conjugates of $-i$ and then find all the conjugates
of $-i$.

18. In $Q_8$, first determine the number of conjugates of $j$ and then find all the conjugates of $j$.

19. In $Q_8$, first determine the number of conjugates of $-j$ and then find all the conjugates
of $-j$.

20. In $Q_8$, first determine the number of conjugates of $ij$ and then find all the conjugates of $ij$.

21. In $Q_8$, first determine the number of conjugates of $-ij$ and then find all the conjugates
of $-ij$.

22. Check that your answers to exercises 15-21 are consistent with the class equation.

23. In $D_n$, determine how many conjugates $y$ has and then find them.

24. In $D_n$, if $n$ is odd, determine how many conjugates $x$ has and then find them.

25. In $D_n$, if $n$ is even, determine how many conjugates $x$ has and then find them.

26. Show that all groups of order 25, 26, 27, and 28 are solvable.

27. Show that all groups of order 29, 31, 32, and 33 are solvable.

28. Show that all groups of order 34, 35, 37, and 38 are solvable.

29. Show that all groups of order 39, 41, 43, and 44 are solvable.

30. Show that all groups of order 46, 47, 49, and 50 are solvable.

31. Show that all groups of order 51, 52, 53, and 54 are solvable.

32. Show that all groups of order 55, 57, 58, and 59 are solvable.



In exercises 33-36, we let $D_n$ be the $n$th dihedral group, where $n \ge 4$ is an even integer. We
also let $H = \langle y^2 \rangle$, the cyclic subgroup generated by $y^2$.

33. Find the order of $H$ and the index of $H$ in $D_n$.

34. Show that $H$ is a normal subgroup of $D_n$.

35. If we let $H$, $Hx$, $Hy$, $Hyx$ be the names of the four cosets corresponding to $H$, complete
the multiplication table for the factor group $D_n/H$.

      ·   |  H   |  Hx  |  Hy  |  Hyx
    ------+------+------+------+------
      H   |      |      |      |
      Hx  |      |      |      |
      Hy  |      |      |      |
      Hyx |      |      |      |

36. Based on your answer to exercise 35, is $D_n/H$ isomorphic to $C_4$ or to $C_2 \times C_2$?


In exercises 37-40, we examine the group $Q_8$ and let $K = \langle -1 \rangle$, the cyclic subgroup
generated by $-1$.

37. Find the order of $K$ and the index of $K$ in $Q_8$.

38. Show that $K$ is a normal subgroup of $Q_8$.

39. If we let $K$, $Ki$, $Kj$, $Kij$ be the names of the four cosets corresponding to $K$, complete the
multiplication table for the factor group $Q_8/K$.

      ·   |  K   |  Ki  |  Kj  |  Kij
    ------+------+------+------+------
      K   |      |      |      |
      Ki  |      |      |      |
      Kj  |      |      |      |
      Kij |      |      |      |

40. Based on your answer to exercise 39, is $Q_8/K$ isomorphic to $C_4$ or to $C_2 \times C_2$?


For exercises 41-50, if $G$ is a group and $c \in G$, we say that $c$ is a commutator in $G$ if
$c = a \cdot b \cdot a^{-1} \cdot b^{-1}$, for some $a, b \in G$. We refer to the term $a \cdot b \cdot a^{-1} \cdot b^{-1}$ as the commutator of
$a$ and $b$.

41. If $G$ is a group and $a, b \in G$, show that the commutator $a \cdot b \cdot a^{-1} \cdot b^{-1}$ is equal to the
identity if and only if $a$ and $b$ commute.

42. In $Q_8$, compute the commutator $i \cdot j \cdot i^{-1} \cdot j^{-1}$.

43. In $Q_8$, compute the commutator $ij \cdot i \cdot (ij)^{-1} \cdot i^{-1}$.

44. In $Q_8$, compute the commutator $j \cdot ij \cdot j^{-1} \cdot (ij)^{-1}$.

45. In $S_3$, compute the commutator $f \cdot g \cdot f^{-1} \cdot g^{-1}$.

46. In $S_3$, compute the commutator $g \cdot h \cdot g^{-1} \cdot h^{-1}$.

47. In $S_3$, compute the commutator $k \cdot f \cdot k^{-1} \cdot f^{-1}$.

48. In $S_3$, compute the commutator $h \cdot j \cdot h^{-1} \cdot j^{-1}$.

49. If $\phi : G_1 \to G_2$ is a homomorphism of groups, show that if $c \in G_1$ is a commutator then
$\phi(c)$ is a commutator in $G_2$.

50. Show that the inverse of every commutator in a group is also a commutator.

51. Let $G$ be a group with normal subgroups $H, K$. If $h \in H$ and $k \in K$, show that
$h \cdot k \cdot h^{-1} \cdot k^{-1} \in H \cap K$.

52. Let $G$ be a group with normal subgroups $H, K$. If $|H \cap K| = 1$, show that $hk = kh$, for all
$h \in H$ and $k \in K$.

53. If $G$ is a group, show that $Z(G)$ is a normal subgroup of $G$.

54. If $G$ is a nonabelian group of order $p^3$, where $p$ is a prime, show that $|Z(G)| = p$.


For exercises 55-58, if $G$ is a group with subgroup $H$, then the set $N(H) =
\{g \in G \mid gHg^{-1} = H\}$ is called the normalizer of $H$.

55. If $G$ is a group with subgroup $H$, show that $H$ is normal if and only if $G = N(H)$.

56. If $G$ is a group with subgroup $H$, show that $N(H)$ is a subgroup of $G$ that contains $H$.

57. If $G$ is a group with subgroup $H$, show that $Z(G) \subseteq N(H)$.

58. Suppose $G$ is a nonabelian group with a subgroup $H$ such that $|N(H)| = p$, where $p$ is a
prime. Show that $Z(G)$ contains only one element.

59. Let $\phi : G_1 \to G_2$ be a homomorphism of groups. Show that $\phi$ is injective if and only if
$\mathrm{Ker}(\phi) = \{e_1\}$, where $e_1$ is the identity element of $G_1$.

60. Let $\phi : G_1 \to G_2$ be a homomorphism of finite groups such that $|G_1|$ and $|G_2|$ are
relatively prime. Show that $\mathrm{Ker}(\phi) = G_1$.

61. If $\phi : G_1 \to G_2$ is a surjective homomorphism of groups and $H$ is a normal subgroup of
$G_1$, show that $\phi(H)$ is a normal subgroup of $G_2$.


At the end of Section 8.2, we presented a table that listed all groups, up to isomorphism, of 
order at most 11. We now have the tools needed to verify that our table is indeed complete for 
groups of order at most 11. Corollary 8.29 told us that any group of prime order must be 
cyclic. Combined with the fact that cyclic groups with the same number of elements must be 
isomorphic, we can see that the table is complete for groups of order 1, 2, 3,5, 7, 11. Next, 


Corollary 8.38 verifies the table is complete for groups of order 4 and 9. Therefore, it remains 
to examine groups of order 6, 8, and 10. 


In exercises 62-67, we will assume that $G$ is a group of order $2p$, where $p$ is an odd prime.
The classification of groups of order 6 and 10 will be a special case of the results in these
exercises.

62. Show that there exist $x, y \in G$ such that
(a) $o(x) = 2$,
(b) $o(y) = p$,
(c) $\langle y \rangle$ is normal,
(d) every element of $G$ can be expressed uniquely in the form $y^i x^j$, where $0 \le i \le p-1$
and $0 \le j \le 1$, and
(e) $xyx = y^m$, where $1 \le m \le p-1$.

63. If $m = 1$ in part (e) of exercise 62, show that $G = \langle yx \rangle$, so $G \cong C_{2p}$.

64. Use part (e) of exercise 62 to show that $y = xy^m x$.

65. Use part (e) of exercise 62 to show that $xy^m x = y^{m^2}$.

66. Use exercises 64 and 65 to show that $m^2 - 1$ is a multiple of $p$.

67. Show that if $m \ne 1$, then $m = p - 1$ and $G \cong D_p$.


In exercises 68-76, we will assume that $G$ is a group of order 8. These exercises will show that
$G$ must be isomorphic to one of the five groups on our table. In these exercises, we will let
$t(G)$ denote the largest integer $m$ such that $G$ contains an element of order $m$.

68. If $G$ is nonabelian, show that $t(G) = 4$.

69. If $t(G) = 8$, show that $G \cong C_8$.

70. If $G$ is abelian and $t(G) = 4$, show that $G \cong C_4 \times C_2$.

71. If $t(G) = 2$, show that $G \cong C_2 \times C_2 \times C_2$.

72. If $G$ is nonabelian, show that
(a) there exists some $y \in G$ such that $o(y) = 4$,
(b) $\langle y \rangle$ is normal,
(c) either (i) there exists $x \in G$, $x \notin \langle y \rangle$ such that $o(x) = 2$, or (ii) $o(x) = 4$, for every
$x \in G$, $x \notin \langle y \rangle$.

73. If (i) occurs in part (c) of exercise 72, show that every element of $G$ can be written
uniquely in the form $y^i x^j$, where $0 \le i \le 3$ and $0 \le j \le 1$.

74. If $x, y$ are as in exercise 73, show that $xyx = y^3$ and $G \cong D_4$.

75. If (ii) occurs in part (c) of exercise 72, let $x \in G$ such that $x \notin \langle y \rangle$ and $o(x) = 4$.
(a) Show that $x^2 = y^2$.
(b) Show that every element of $G$ can be expressed uniquely as $y^k x^l$, where $0 \le k \le 3$
and $0 \le l \le 1$.
(c) Show that $xy = y^3x$.

76. If $x, y$ are as in exercise 75, let $\phi : G \to Q_8$ be the function defined as $\phi(y^k x^l) = i^k j^l$, for
$0 \le k \le 3$ and $0 \le l \le 1$. Show that $\phi$ is an isomorphism.


In exercises 77-80, $G$ will be a group of order $pq$, where $p > q$ are primes and $q$ does not
divide $p - 1$. We will show that $G$ must be cyclic. For example, we will have succeeded in
classifying groups of order 15, 33, and 35. However, since $q$ divides $p - 1$ when $p = 7$, $q = 3$
and $p = 11$, $q = 5$, these exercises do not fully classify groups of order 21 or 55.

77. Show that there exist $x, y \in G$ such that
(a) $o(x) = q$,
(b) $o(y) = p$,
(c) $\langle y \rangle$ is normal,
(d) every element of $G$ can be expressed uniquely as $y^i x^j$, where $0 \le i \le p-1$ and
$0 \le j \le q-1$,
(e) $xyx^{-1} = y^m$, where $1 \le m \le p-1$.

78. If $x, y, m$ are as in exercise 77, show that $y = y^{m^q}$.

79. If $m$ is as in exercises 77 and 78, show that the order of $[m]_p$ in $U(\mathbb{Z}_p)$ is a divisor of both
$q$ and $p - 1$.

80. Use exercise 79 to show that $m = 1$ and conclude that $G = \langle yx \rangle$ and $G \cong C_{pq}$.


8.4 Symmetric Groups 


Symmetric groups are perhaps the most interesting and important class of groups in finite
group theory. So far, virtually all of the finite groups we have looked at have been solvable.
However, in this section, we will show that $S_n$ is not solvable, for $n \ge 5$. In fact, the $n = 5$ case
is a big part of the proof of the insolvability of the quintic. Not only can we think of $S_n$ as
being more complicated than other nonabelian groups like $D_n$, but we will also show that $S_n$
contains an isomorphic copy of every group of order $n$. Thus, it quickly becomes apparent that
symmetric groups represent an essential piece of finite group theory.




The good news when studying $S_n$ is that there are some very convenient ways of representing
its elements. We begin with


Definition 8.47. If $\{a_1, a_2, \ldots, a_t\}$ is a subset of $\{1, 2, \ldots, n\}$, then we let
$(a_1\ a_2\ \cdots\ a_t)$
be a shorthand for the element $f \in S_n$ such that
$f(a_1) = a_2,\ f(a_2) = a_3,\ \ldots,\ f(a_{t-1}) = a_t,\ f(a_t) = a_1$,
and $f(b) = b$, for every other element $b$ of $\{1, 2, \ldots, n\}$. We refer to functions of this type as
t-cycles.


Example


The 3-cycle $(1\ 4\ 3)$, when viewed as an element of $S_4$, is a shorthand for the function $f$
such that


$f(1) = 4$, $f(4) = 3$, $f(3) = 1$, and $f(2) = 2$.


Note that we could also consider $(1\ 4\ 3)$ as an element of $S_5$, in which it would also be
the case that $f(5) = 5$. In fact, $(1\ 4\ 3)$ can be considered as an element of $S_n$, for all
$n \ge 4$, and it would act as the identity on any $m \ge 5$.


As we will see in the following examples, every element of $S_n$ that can be represented by a
t-cycle can actually be represented by $t$ different t-cycles. This is because we can begin the
t-cycle with any of the $t$ terms that appear in the cycle.

Examples
$(1\ 4\ 3) = (4\ 3\ 1) = (3\ 1\ 4)$,
$(6\ 7) = (7\ 6)$,
$(1\ 2\ 3\ 4\ 5) = (2\ 3\ 4\ 5\ 1) = (3\ 4\ 5\ 1\ 2) = (4\ 5\ 1\ 2\ 3) = (5\ 1\ 2\ 3\ 4)$.
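The cycle shorthand can be made concrete with a small Python sketch. It assumes a permutation of {1, ..., n} is stored as a dictionary from each element to its image; the helper names `cycle_to_map` and `rotations` are illustrative and are not notation from the text.

```python
def cycle_to_map(cycle, n):
    """Return the element of S_n described by a cycle, as a dict x -> f(x)."""
    perm = {b: b for b in range(1, n + 1)}      # every unmentioned element is fixed
    t = len(cycle)
    for idx, a in enumerate(cycle):
        perm[a] = cycle[(idx + 1) % t]          # a_i -> a_(i+1), and a_t -> a_1
    return perm


def rotations(cycle):
    """All t ways of writing the same t-cycle, one for each starting entry."""
    return [cycle[s:] + cycle[:s] for s in range(len(cycle))]


f = cycle_to_map((1, 4, 3), 4)
print(f)  # {1: 4, 2: 2, 3: 1, 4: 3}

# Every rotation of (1 2 3 4 5) describes the same function in S_5.
same = all(cycle_to_map(c, 5) == cycle_to_map((1, 2, 3, 4, 5), 5)
           for c in rotations((1, 2, 3, 4, 5)))
print(same)  # True
```

Storing the function itself, rather than the written cycle, is what makes the different spellings of one cycle compare as equal.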


When cycles, such as $(4\ 1\ 7)$, $(3\ 8\ 2)$, and $(5\ 6)$, have no elements of $\{1, 2, \ldots, n\}$ in common,
we say that they are disjoint cycles. It is often the case in algebra that simple objects serve as
the building blocks of more complicated objects. For example, prime numbers are the building
blocks of the natural numbers. In a similar manner, we can consider disjoint cycles to be the
building blocks of $S_n$, as every $f \in S_n$ other than the identity can be written as a product of
disjoint cycles. This is done by following what $f$ does to each element of $\{1, 2, \ldots, n\}$.


Example
Let $f \in S_{10}$ be the function such that
$f(1) = 8$, $f(2) = 2$, $f(3) = 9$, $f(4) = 10$, $f(5) = 1$,
$f(6) = 5$, $f(7) = 4$, $f(8) = 6$, $f(9) = 3$, $f(10) = 7$.
For any $i \in \{1, 2, \ldots, n\}$ and $\sigma \in S_n$, we can examine the list


$i \mapsto \sigma(i) \mapsto \sigma^2(i) \mapsto \sigma^3(i) \mapsto \cdots$


until the first repetition occurs. Every element of $\{1, 2, \ldots, n\}$ will appear exactly once on
a list of this type and it gives us a shorthand for representing elements of $S_n$. In this
example, it allows us to abbreviate the action of $f$ on $\{1, 2, \ldots, 10\}$ as

$1 \mapsto 8 \mapsto 6 \mapsto 5 \mapsto 1$, $2 \mapsto 2$,
$3 \mapsto 9 \mapsto 3$, $4 \mapsto 10 \mapsto 7 \mapsto 4$.


Having broken $f$ into four pieces, at first glance it appears that $f$ consists of four disjoint
cycles. However, we need not include the piece that tells us that $f$ sends 2 to 2.
Therefore, we can express $f$ as a product of three disjoint cycles as


$f = (1\ 8\ 6\ 5)(3\ 9)(4\ 10\ 7)$.
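The procedure of following each element until the first repetition can be sketched in Python as follows. The dictionary representation and the function name `disjoint_cycles` are assumptions made for illustration only.

```python
def disjoint_cycles(perm):
    """Split a permutation (dict x -> f(x)) into disjoint cycles, omitting fixed points."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:        # follow start -> f(start) -> f(f(start)) -> ...
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        if len(cycle) > 1:          # a piece such as 2 -> 2 is not recorded
            cycles.append(tuple(cycle))
    return cycles


# The element f of S_10 from the example above.
f = {1: 8, 2: 2, 3: 9, 4: 10, 5: 1, 6: 5, 7: 4, 8: 6, 9: 3, 10: 7}
print(disjoint_cycles(f))  # [(1, 8, 6, 5), (3, 9), (4, 10, 7)]
```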


Observe that disjoint cycles always commute. However, in general, elements of $S_n$ do not
commute.

Examples


1. Let $g = (3\ 7\ 1\ 2\ 5)$ and $h = (4\ 3\ 2\ 1\ 6)$; recall that when we compose functions, we
begin with the function on the far right. Therefore, as we examine how the function
$gh$ acts on the set $\{1, 2, 3, 4, 5, 6, 7\}$, we see that


$(gh)(1) = 6$, $(gh)(2) = 2$, $(gh)(3) = 5$, $(gh)(4) = 7$,
$(gh)(5) = 3$, $(gh)(6) = 4$, $(gh)(7) = 1$.


Therefore, we can abbreviate the action of $gh$ as


$1 \mapsto 6 \mapsto 4 \mapsto 7 \mapsto 1$,
$2 \mapsto 2$, and $3 \mapsto 5 \mapsto 3$.


As a result, the product $(3\ 7\ 1\ 2\ 5)(4\ 3\ 2\ 1\ 6)$ can now be rewritten using disjoint
cycles as


$(3\ 7\ 1\ 2\ 5)(4\ 3\ 2\ 1\ 6) = (1\ 6\ 4\ 7)(3\ 5)$.

2. Consider the product
$w = (1\ 4\ 3)(2\ 4)(2\ 3)(1\ 4\ 3)(2\ 4\ 3)(1\ 2\ 4)$;


at first, it is not clear where $w$ sends each element of $\{1, 2, 3, 4\}$. However, when we
compute the composition of these six functions, we can see that


$1 \mapsto 3 \mapsto 1$, $2 \mapsto 4 \mapsto 2$.


Therefore,


$w = (1\ 4\ 3)(2\ 4)(2\ 3)(1\ 4\ 3)(2\ 4\ 3)(1\ 2\ 4) = (1\ 3)(2\ 4)$.
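Both products above can be checked mechanically. The sketch below assumes the same dictionary representation of permutations; `compose` and `product` are hypothetical helper names, and composition is taken right to left as in the text.

```python
from functools import reduce


def cycle_to_map(cycle, n):
    """Element of S_n described by a cycle, as a dict x -> image of x."""
    perm = {b: b for b in range(1, n + 1)}
    for idx, a in enumerate(cycle):
        perm[a] = cycle[(idx + 1) % len(cycle)]
    return perm


def compose(p, q):
    """Return the composition 'p after q': apply q first, then p."""
    return {x: p[q[x]] for x in q}


def product(cycles, n):
    """Multiply cycles of S_n written left to right as on the page."""
    return reduce(compose, (cycle_to_map(c, n) for c in cycles))


gh = product([(3, 7, 1, 2, 5), (4, 3, 2, 1, 6)], 7)
hg = product([(4, 3, 2, 1, 6), (3, 7, 1, 2, 5)], 7)
print(gh == product([(1, 6, 4, 7), (3, 5)], 7))   # True
print(gh == hg)                                   # False: g and h do not commute

w = product([(1, 4, 3), (2, 4), (2, 3), (1, 4, 3), (2, 4, 3), (1, 2, 4)], 4)
print(w == product([(1, 3), (2, 4)], 4))          # True
```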


Observe that when we rewrite a product of functions in $S_n$ as a product of disjoint cycles, it is
easy to see how the product acts on the elements of $\{1, 2, \ldots, n\}$. Another advantage of using
disjoint cycles is that it makes it easier to compute the order of the functions in $S_n$. We record
this as

Lemma 8.48


(a) If an element $g$ of a group $G$ has order $n \ge 1$, then $g^m$ is the identity if and only if $m$ is a
multiple of $n$.


(b) Every t-cycle in $S_n$ has order $t$.
(c) If $f \in S_n$ is written as the product of disjoint cycles $f_1 f_2 \cdots f_m$, where each $f_i$ is a
$t_i$-cycle, then the order of $f$ is equal to the least common multiple of $t_1, t_2, \ldots, t_m$.


Proof. In one direction, if $m$ is a multiple of $n$, then there exists $q \in \mathbb{Z}$ such that $m = qn$.
Therefore,


$g^m = g^{qn} = (g^n)^q = e^q = e$,


as desired. In the other direction, suppose that $g^m = e$. The division algorithm tells us that
there exist $q, r \in \mathbb{Z}$ such that


$m = qn + r$ and $0 \le r < n$.


We now have


$e = g^m = g^{qn+r} = g^{qn} g^r = (g^n)^q g^r = e^q g^r = g^r$.


Since $r < n$, the minimality of $n$ tells us that $r = 0$. Hence, $m = qn$, so $m$ is a multiple of $n$, as
needed.


For part (b), if $g = (i_1\ i_2\ \cdots\ i_t)$ is a t-cycle belonging to $S_n$, then


$g(i_1) = i_2 \ne i_1$, $g^2(i_1) = i_3 \ne i_1$, $\ldots$, $g^{t-1}(i_1) = i_t \ne i_1$.


As a result, the order of $g$ cannot be less than $t$. On the other hand, it is clear that $g^t$ is the
identity map, so $g$ has order $t$.


For part (c), let $n$ be the least common multiple of $t_1, t_2, \ldots, t_m$. In light of part (b), $n$ is a
multiple of the order of every $f_i$. Therefore, part (a) tells us that $f_i^n = e$, for $1 \le i \le m$. Since
the $f_i$'s are disjoint cycles, they all commute. As a result, we now have


$f^n = (f_1 f_2 \cdots f_m)^n = f_1^n f_2^n \cdots f_m^n = e \cdot e \cdots e = e$.


Since $f^n = e$, in order to show that $o(f) = n$, it suffices to show that if $1 \le l < n$, then $f^l \ne e$.
To this end, since the $f_i$ commute


$f^l = (f_1 f_2 \cdots f_m)^l = f_1^l f_2^l \cdots f_m^l$.


Since $l$ is less than $n$, we know that $l$ is not a multiple of some $t_i$. In light of part (a), we now
know that $f_i^l \ne e$. Therefore, there exists some $j \in \{1, 2, \ldots, n\}$ such that $f_i^l(j) = k \ne j$. Since
$f_1, f_2, \ldots, f_m$ are disjoint cycles and the cycle $f_i$ must mention both $j$ and $k$, it follows that
none of the other cycles mention either $j$ or $k$. As a result, every other cycle in $f$ fixes both $j$
and $k$. Therefore, when we look at


$f^l(j) = (f_1 f_2 \cdots f_m)^l(j) = (f_1^l f_2^l \cdots f_m^l)(j)$,
we can see that any term to the right of $f_i^l$ fixes $j$ and any term to the left of $f_i^l$ fixes $k$. Thus,


$f^l(j) = f_i^l(j) = k \ne j$,


so $f^l \ne e$, as needed.


To illustrate how to use Lemma 8.48, we revisit some of our earlier examples. 


Examples
1. To compute the order of $(3\ 7\ 1\ 2\ 5)(4\ 3\ 2\ 1\ 6)$, Lemma 8.48 and the fact that


$(3\ 7\ 1\ 2\ 5)(4\ 3\ 2\ 1\ 6) = (1\ 6\ 4\ 7)(3\ 5)$


tell us that $o((3\ 7\ 1\ 2\ 5)(4\ 3\ 2\ 1\ 6))$ is the least common multiple of $o((1\ 6\ 4\ 7)) = 4$
and $o((3\ 5)) = 2$. Hence, $o((3\ 7\ 1\ 2\ 5)(4\ 3\ 2\ 1\ 6)) = 4$.


2. If $w = (1\ 4\ 3)(2\ 4)(2\ 3)(1\ 4\ 3)(2\ 4\ 3)(1\ 2\ 4)$, then since
$(1\ 4\ 3)(2\ 4)(2\ 3)(1\ 4\ 3)(2\ 4\ 3)(1\ 2\ 4) = (1\ 3)(2\ 4)$,


we have that $o(w)$ is the least common multiple of $o((1\ 3)) = 2$ and $o((2\ 4)) = 2$.
Thus, $o(w) = 2$.
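Lemma 8.48(c) can be compared against a direct computation of the order. The following sketch assumes disjoint cycles are given as tuples; the function names are illustrative, and the brute-force routine simply takes powers until the identity appears.

```python
from functools import reduce
from math import gcd


def order_from_cycles(cycles):
    """Least common multiple of the cycle lengths, as in Lemma 8.48(c)."""
    return reduce(lambda a, b: a * b // gcd(a, b), (len(c) for c in cycles), 1)


def cycles_to_map(cycles, n):
    """Build the dict x -> f(x) for a product of disjoint cycles in S_n."""
    perm = {x: x for x in range(1, n + 1)}
    for cycle in cycles:
        for idx, a in enumerate(cycle):
            perm[a] = cycle[(idx + 1) % len(cycle)]
    return perm


def order_brute_force(perm):
    """Smallest k >= 1 such that the k-th power of perm is the identity."""
    identity = {x: x for x in perm}
    power, k = dict(perm), 1
    while power != identity:
        power = {x: perm[power[x]] for x in perm}   # multiply by perm once more
        k += 1
    return k


examples = [([(1, 6, 4, 7), (3, 5)], 7),
            ([(1, 3), (2, 4)], 4),
            ([(1, 8, 6, 5), (3, 9), (4, 10, 7)], 10)]
for cycles, n in examples:
    print(order_from_cycles(cycles), order_brute_force(cycles_to_map(cycles, n)))
# prints 4 4, then 2 2, then 12 12
```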


Another name for a 2-cycle is a transposition. As we will now see, if we look at $S_n$ from a
somewhat different perspective, we can also view transpositions as the building blocks of $S_n$.
Since we already have concrete ways of representing elements in $S_n$, proofs about the structure
of $S_n$ often follow from some very short and simple formulas. That is certainly the case with
our next proof. It will follow easily from a simple formula that, unfortunately, seems to appear 
from out of nowhere. However, the formula is actually the result of a great deal of 
experimentation and trial and error. Since none of this experimentation appears in the proof, 
the argument is quite short but does not add to our intuition. 


Lemma 8.49. Let $n \ge 2$;
(a) every element of $S_n$ is equal to a product of transpositions and


(b) if $f \in S_n$, then $f \circ (i\ j) \circ f^{-1} = (f(i)\ f(j))$, for any $1 \le i, j \le n$.


Proof. For part (a), we have already observed that every element of $S_n$ can be written as a
product of disjoint cycles. Therefore, it suffices to show that every cycle can be written as a
product of transpositions. To this end, consider the t-cycle $(i_1\ i_2\ \cdots\ i_j\ i_{j+1}\ \cdots\ i_{t-1}\ i_t)$;
we claim that


$(i_1\ i_2\ \cdots\ i_{t-1}\ i_t) = (i_1\ i_2)(i_2\ i_3) \cdots (i_j\ i_{j+1}) \cdots (i_{t-1}\ i_t)$.


If we let $g$ be the function obtained by composing the $t - 1$ transpositions from the previous
equation, then it is clear that both $(i_1\ i_2\ \cdots\ i_{t-1}\ i_t)$ and $g$ act as the identity on any element of
$\{1, 2, \ldots, n\}$ not mentioned in $(i_1\ i_2\ \cdots\ i_{t-1}\ i_t)$. In order to show that $(i_1\ i_2\ \cdots\ i_{t-1}\ i_t)$ and $g$
are the same element of $S_n$, there are two remaining cases. First, when we compute $g(i_t)$, as
we go from right to left in composing the $t - 1$ transpositions, we can see that each
transposition drops the subscript by 1 and


$i_t \mapsto i_{t-1} \mapsto i_{t-2} \mapsto \cdots \mapsto i_2 \mapsto i_1$.


Thus, $g(i_t) = i_1$, and we see that $g$ and $(i_1\ i_2\ \cdots\ i_{t-1}\ i_t)$ do the same thing to $i_t$.


Next, if $1 \le j \le t - 1$, then $i_j$ does not appear in any transposition until $(i_j\ i_{j+1})$ and this
transposition sends $i_j$ to $i_{j+1}$. However, $i_{j+1}$ does not appear in any of the transpositions to the
left of $(i_j\ i_{j+1})$, so $i_{j+1}$ remains fixed the rest of the way. As a result, $g(i_j) = i_{j+1}$, so $g$ and
$(i_1\ i_2\ \cdots\ i_{t-1}\ i_t)$ also do the same thing to all $i_j$, where $1 \le j \le t - 1$. Thus, we have indeed
shown that $(i_1\ i_2\ \cdots\ i_{t-1}\ i_t)$ and $(i_1\ i_2)(i_2\ i_3) \cdots (i_j\ i_{j+1}) \cdots (i_{t-1}\ i_t)$ are the same element
of $S_n$.


For part (b), we must examine what the composition $f \circ (i\ j) \circ f^{-1}$ does to every element of
$\{1, 2, \ldots, n\}$. First, if $1 \le k \le n$ and $k$ is not equal to either $f(i)$ or $f(j)$, then $f^{-1}(k)$ is not
equal to either $i$ or $j$. Therefore, the transposition $(i\ j)$ fixes $f^{-1}(k)$ and the composition
$f \circ (i\ j) \circ f^{-1}$ does the following to $k$:


$k \mapsto f^{-1}(k) \mapsto f^{-1}(k) \mapsto k$.
On the other hand, the composition $f \circ (i\ j) \circ f^{-1}$ does the following to $f(i)$:
$f(i) \mapsto i \mapsto j \mapsto f(j)$.
Similarly, $f \circ (i\ j) \circ f^{-1}$ does the following to $f(j)$:
$f(j) \mapsto j \mapsto i \mapsto f(i)$.


The preceding computations show that $f \circ (i\ j) \circ f^{-1}$ interchanges $f(i)$ and $f(j)$ while fixing
every other element of $\{1, 2, \ldots, n\}$. However, this is exactly what the transposition $(f(i)\ f(j))$
does, so $f \circ (i\ j) \circ f^{-1} = (f(i)\ f(j))$, as desired.
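The conjugation formula of Lemma 8.49(b) is easy to test numerically. The sketch below picks one arbitrary element f of S_7 with a fixed random seed (an assumption made only so the run is repeatable) and checks the formula for every pair i < j; all helper names are illustrative.

```python
import random
from itertools import combinations


def transposition(i, j, n):
    """The transposition (i j) in S_n as a dict x -> image of x."""
    perm = {x: x for x in range(1, n + 1)}
    perm[i], perm[j] = j, i
    return perm


def compose(p, q):
    """Apply q first, then p."""
    return {x: p[q[x]] for x in q}


def inverse(p):
    return {image: x for x, image in p.items()}


def conjugation_formula_holds(f):
    """Check f (i j) f^(-1) == (f(i) f(j)) for all i < j."""
    n = len(f)
    f_inv = inverse(f)
    return all(
        compose(compose(f, transposition(i, j, n)), f_inv)
        == transposition(f[i], f[j], n)
        for i, j in combinations(range(1, n + 1), 2)
    )


rng = random.Random(0)                 # fixed seed, chosen arbitrarily
images = list(range(1, 8))
rng.shuffle(images)
f = dict(zip(range(1, 8), images))     # an arbitrary element of S_7
print(conjugation_formula_holds(f))    # True
```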


To illustrate the formula we verified in part (a) of Lemma 8.49, we have 


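Example
$(3\ 2\ 1\ 4\ 7\ 5) = (3\ 2)(2\ 1)(1\ 4)(4\ 7)(7\ 5)$ and
$(8\ 6\ 2\ 4\ 1\ 9\ 3\ 7) = (8\ 6)(6\ 2)(2\ 4)(4\ 1)(1\ 9)(9\ 3)(3\ 7)$.
These combine to tell us that
$(3\ 2\ 1\ 4\ 7\ 5)(8\ 6\ 2\ 4\ 1\ 9\ 3\ 7) =$
$(3\ 2)(2\ 1)(1\ 4)(4\ 7)(7\ 5)(8\ 6)(6\ 2)(2\ 4)(4\ 1)(1\ 9)(9\ 3)(3\ 7)$.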


When we showed that prime numbers were the building blocks of the natural numbers, we
showed that the representation of each $n \ge 2$ as a product of primes was unique up to order.
Similarly, when expressing an element of $S_n$ as a product of disjoint cycles, this representation
is also unique up to the order of the cycles and the fact that every t-cycle can be written using
any of the $t$ terms from the cycle in the first position. However, there is no analogous statement
regarding uniqueness when it comes to representing cycles as products of transpositions. For
example, we could replace the formula


$(i_1\ i_2\ \cdots\ i_{t-1}\ i_t) = (i_1\ i_2)(i_2\ i_3) \cdots (i_j\ i_{j+1}) \cdots (i_{t-1}\ i_t)$
in the proof of Lemma 8.49(a) by the formula
$(i_1\ i_2\ \cdots\ i_{t-1}\ i_t) = (i_{t-1}\ i_t)(i_{t-2}\ i_t) \cdots (i_2\ i_t)(i_1\ i_t)$.


This formula is another example of a fact about $S_n$ which, although straightforward to verify,
takes a great deal of experimentation to find. If we use this new formula, we now have


$(3\ 2\ 1\ 4\ 7\ 5) = (7\ 5)(4\ 5)(1\ 5)(2\ 5)(3\ 5)$
and
$(8\ 6\ 2\ 4\ 1\ 9\ 3\ 7) = (3\ 7)(9\ 7)(1\ 7)(4\ 7)(2\ 7)(6\ 7)(8\ 7)$.
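Both ways of writing a cycle as a product of transpositions can be checked by multiplying the transpositions back together. In this sketch the names `adjacent_form` and `last_entry_form` are hypothetical labels for the two formulas, and products are composed right to left.

```python
from functools import reduce


def cycle_to_map(cycle, n):
    perm = {b: b for b in range(1, n + 1)}
    for idx, a in enumerate(cycle):
        perm[a] = cycle[(idx + 1) % len(cycle)]
    return perm


def compose(p, q):
    """Apply q first, then p."""
    return {x: p[q[x]] for x in q}


def product(cycles, n):
    return reduce(compose, (cycle_to_map(c, n) for c in cycles))


def adjacent_form(cycle):
    """(i1 i2)(i2 i3)...(i_(t-1) i_t), the formula from the proof of Lemma 8.49(a)."""
    return [(cycle[k], cycle[k + 1]) for k in range(len(cycle) - 1)]


def last_entry_form(cycle):
    """(i_(t-1) i_t)(i_(t-2) i_t)...(i1 i_t), the alternative formula."""
    return [(cycle[k], cycle[-1]) for k in range(len(cycle) - 2, -1, -1)]


print(last_entry_form((3, 2, 1, 4, 7, 5)))
# [(7, 5), (4, 5), (1, 5), (2, 5), (3, 5)]

for cycle, n in [((3, 2, 1, 4, 7, 5), 7), ((8, 6, 2, 4, 1, 9, 3, 7), 9)]:
    target = cycle_to_map(cycle, n)
    print(product(adjacent_form(cycle), n) == target,
          product(last_entry_form(cycle), n) == target)   # True True
```

The two lists of transpositions differ, yet both multiply out to the same cycle, which is the non-uniqueness the text describes.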


Our next lemma is another example of how short formulas in $S_n$ can be exploited to obtain a
good deal of information about $S_n$.


Lemma 8.50. If $H$ is a subgroup of $S_n$ that contains $(1\ 2)$ and $(1\ 2\ 3\ \cdots\ n)$, then $H = S_n$.


Proof. Lemma 8.49(a) tells us that every element of $S_n$ is a product of transpositions, so it
suffices to show that $H$ contains all transpositions.


Every transposition in $S_n$ can be written as $(i\ j)$, where $1 \le i < j \le n$. We will first show that
$H$ contains all transpositions of the form $(i\ \ i+1)$, where $1 \le i \le n - 1$. When $i = 1$, there is
nothing to prove, so we may assume that $i > 1$. If we let $g = (1\ 2\ 3\ \cdots\ n)$ and let $f = g^{i-1}$,
then we can see that $f \in H$ and


$f(1) = i$ and $f(2) = i + 1$.
Since $H$ is a subgroup, we have $f \circ (1\ 2) \circ f^{-1} \in H$ and Lemma 8.49(b) tells us that
$f \circ (1\ 2) \circ f^{-1} = (f(1)\ f(2)) = (i\ \ i+1)$.
The preceding computations tell us that $H$ contains the transpositions


$(1\ 2), (2\ 3), (3\ 4), \ldots, (n-1\ \ n)$.


We need to finish showing that $H$ contains all transpositions of the form $(i\ j)$, where
$1 \le i < j \le n$. Observe that we have already handled the case where $j = i + 1$, so we may
assume that $j > i + 1$. If we let


$h = (j-1\ \ j) \cdots (i+2\ \ i+3)(i+1\ \ i+2)$,
then $h \in H$, and it is not hard to see that
$h(i) = i$ and $h(i+1) = j$.
As a result,
$h \circ (i\ \ i+1) \circ h^{-1} \in H$
and Lemma 8.49(b) asserts that


$h \circ (i\ \ i+1) \circ h^{-1} = (h(i)\ h(i+1)) = (i\ j)$.


Thus, $H$ does indeed contain all transpositions in $S_n$, so $H = S_n$, as desired.
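For small n, Lemma 8.50 can be confirmed by generating the subgroup by brute force. This sketch stores a permutation as a tuple p with p[x - 1] equal to the image of x (an implementation choice, not the book's notation) and closes the generators under composition; in a finite group that closure is the generated subgroup.

```python
from math import factorial


def from_cycle(cycle, n):
    """Tuple p with p[x - 1] = image of x under the given cycle in S_n."""
    p = list(range(1, n + 1))
    for idx, a in enumerate(cycle):
        p[a - 1] = cycle[(idx + 1) % len(cycle)]
    return tuple(p)


def generated_subgroup(generators):
    """All products of the generators, found by repeated left multiplication."""
    n = len(generators[0])
    identity = tuple(range(1, n + 1))
    elements, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in generators:
            q = tuple(g[p[x] - 1] for x in range(n))   # q = g after p
            if q not in elements:
                elements.add(q)
                frontier.append(q)
    return elements


for n in range(2, 7):
    H = generated_subgroup([from_cycle((1, 2), n),
                            from_cycle(tuple(range(1, n + 1)), n)])
    print(n, len(H) == factorial(n))   # True for each n from 2 to 6
```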


The next result is one of two in this section that will play a key role in the proof of the 
insolvability of the quintic. 


Theorem 8.51. If $H$ is a subgroup of $S_p$ which contains both a transposition and an element
of order $p$, where $p$ is prime, then $H = S_p$.


Proof: In light of Lemma 8.50, it suffices to show that $H$ contains $(1\ 2)$ and $(1\ 2\ 3\ \cdots\ p)$. The
transposition in $H$ flips two elements of $\{1, 2, \ldots, p\}$ and leaves the other $p - 2$ elements
fixed. Without loss of generality, we may rename the two elements flipped by the transposition
as 1 and 2. Therefore, we may now assume that $H$ contains $(1\ 2)$. Note that we still reserve
the right to rename, if necessary, the other $p - 2$ elements of $\{1, 2, \ldots, p\}$, provided we do not
use the names 1 or 2.


Let $f \in H$ have order $p$; by Lemma 8.48(b)(c), $f$ must be a product of disjoint p-cycles.
However, since each p-cycle mentions every element of $\{1, 2, \ldots, p\}$, it follows that $f$ can be
represented by a single p-cycle.


When we represent $f$ as a p-cycle, we can choose the cycle that begins with the number 1.
Therefore,


$f = (1\ \cdots\ 2\ \cdots)$.


Next, let $j$ represent the number of elements from $\{1, 2, \ldots, p\}$ that appear between 1 and 2
in $f$. For example, if


$f = (1\ a\ b\ c\ 2\ \cdots)$,


then $j = 3$. It is easy to see that $0 \le j \le p - 2$, and if we let $g = f^{j+1}$, we will examine some
of the properties of $g$. First, $g$ is also an element of both $H$ and the cyclic subgroup
generated by $f$. Since $f$ has order $p$ and $1 \le j + 1 \le p - 1$, we can see that $g = f^{j+1}$ is not
the identity. Note that since every element of $\langle f \rangle$ other than the identity has order $p$, it
follows that $g$ has order $p$. However, we already saw that every element of $S_p$ of order $p$ can
be represented by a p-cycle.


The p-cycle representing $f$ has $j$ terms between 1 and 2, so
$g(1) = f^{j+1}(1) = 2$.
As a result, when we write $g$ as a p-cycle, we can write
$g = (1\ 2\ \cdots)$.


The $p - 2$ terms in the cycle for $g$ that are to the right of 2 all belong to the set $\{3, \ldots, p\}$.
However, as we remarked earlier, we reserved the right to rename the elements of this set with
other elements of the set. Therefore, we may assume that the terms in our p-cycle for $g$
following 2 are, in order, $3, 4, \ldots, p$. As a result,


$g = (1\ 2\ 3\ \cdots\ p)$.


Now that we know that $H$ contains both $(1\ 2)$ and $(1\ 2\ 3\ \cdots\ p)$, we can apply Lemma 8.50 to
assert that $H = S_p$, as desired.
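Theorem 8.51 can be verified exhaustively for p = 5, and the same computation suggests why primality matters. The sketch below reuses the brute-force closure idea with tuple-encoded permutations (helper names are illustrative). It checks every pair consisting of a transposition and a 5-cycle in S_5, then looks at one transposition and one 4-cycle in S_4.

```python
from itertools import combinations, permutations


def from_cycle(cycle, n):
    """Tuple p with p[x - 1] = image of x under the given cycle in S_n."""
    p = list(range(1, n + 1))
    for idx, a in enumerate(cycle):
        p[a - 1] = cycle[(idx + 1) % len(cycle)]
    return tuple(p)


def generated_subgroup(generators):
    n = len(generators[0])
    identity = tuple(range(1, n + 1))
    elements, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in generators:
            q = tuple(g[p[x] - 1] for x in range(n))   # q = g after p
            if q not in elements:
                elements.add(q)
                frontier.append(q)
    return elements


p = 5
transpositions = [from_cycle(t, p) for t in combinations(range(1, p + 1), 2)]
# Every 5-cycle can be written starting with 1, so these are all 24 of them.
p_cycles = [from_cycle((1,) + rest, p) for rest in permutations(range(2, p + 1))]

print(all(len(generated_subgroup([t, c])) == 120
          for t in transpositions for c in p_cycles))        # True

# With n = 4 (not prime), a transposition and a 4-cycle need not generate S_4.
small = generated_subgroup([from_cycle((1, 3), 4), from_cycle((1, 2, 3, 4), 4)])
print(len(small))                                            # 8
```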


Our next lemma is yet another example of a significant fact about $S_n$ that, once you have found
the right formula, is quite easy to prove.


Lemma 8.52. Let $H, K$ be subgroups of $S_n$, where $n \ge 5$, such that

(a) $H$ contains all 3-cycles and

(b) $ghg^{-1}h^{-1} \in K$, for all $g, h \in H$.

Then $K$ also contains all 3-cycles.

Proof. It suffices to show that, if $f \in S_n$ is a 3-cycle, then there exist $g, h \in H$ such that
$f = ghg^{-1}h^{-1}$. To this end, since $f$ is a 3-cycle, there exist three different elements in
$\{1, 2, \ldots, n\}$, which we will call $a_1, a_2, a_3$, such that $f = (a_1\ a_2\ a_3)$. Since $n \ge 5$, there are at
least two elements in $\{1, 2, \ldots, n\}$ that are not in the set $\{a_1, a_2, a_3\}$, and we will call them $a_4$
and $a_5$. If we let


$g = (a_1\ a_2\ a_4)$ and $h = (a_1\ a_3\ a_5)$,


then both $g$ and $h$ are 3-cycles and therefore belong to $H$. Since
$(a_1\ a_2\ a_4)^{-1} = (a_1\ a_4\ a_2)$ and $(a_1\ a_3\ a_5)^{-1} = (a_1\ a_5\ a_3)$,
a simple computation tells us that


$ghg^{-1}h^{-1} = (a_1\ a_2\ a_4)(a_1\ a_3\ a_5)(a_1\ a_2\ a_4)^{-1}(a_1\ a_3\ a_5)^{-1} =$


$(a_1\ a_2\ a_4)(a_1\ a_3\ a_5)(a_1\ a_4\ a_2)(a_1\ a_5\ a_3) = (a_1\ a_2\ a_3) = f$.


Thus, $f \in K$, so $K$ does indeed contain all 3-cycles, as desired.
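The "simple computation" in this proof can be repeated for every choice of five distinct points. The sketch below works in S_6, a size chosen here only to keep the run small, and checks the commutator identity for all 720 ordered choices of a_1, ..., a_5; helper names are illustrative.

```python
from itertools import permutations


def cycle_to_map(cycle, n):
    perm = {b: b for b in range(1, n + 1)}
    for idx, a in enumerate(cycle):
        perm[a] = cycle[(idx + 1) % len(cycle)]
    return perm


def compose(p, q):
    """Apply q first, then p."""
    return {x: p[q[x]] for x in q}


def inverse(p):
    return {image: x for x, image in p.items()}


def commutator(g, h):
    """g h g^(-1) h^(-1), composed right to left."""
    return compose(compose(g, h), compose(inverse(g), inverse(h)))


n = 6
ok = all(
    commutator(cycle_to_map((a1, a2, a4), n), cycle_to_map((a1, a3, a5), n))
    == cycle_to_map((a1, a2, a3), n)
    for a1, a2, a3, a4, a5 in permutations(range(1, n + 1), 5)
)
print(ok)  # True
```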


We can now prove the important fact, which we mentioned at the beginning of this section,
that $S_n$ is not solvable when $n \ge 5$.


Theorem 8.53. If $n \ge 5$, then $S_n$ is not solvable.


Proof. Suppose $S_n$ contains a chain of subgroups
$S_n = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_{m-1} \supseteq G_m$
such that whenever $g, h \in G_i$, we have $ghg^{-1}h^{-1} \in G_{i+1}$, for $0 \le i \le m - 1$.


Lemma 8.52 tells us that whenever $G_i$ contains all 3-cycles, then so does $G_{i+1}$, for
$0 \le i \le m - 1$. Since $G_0 = S_n$, it contains all 3-cycles, so every subgroup on this chain also
contains all 3-cycles. If $S_n$ were solvable, there would exist some chain of this type where
$G_m = \{e\}$. But since $G_m$ must contain all 3-cycles, $G_m \ne \{e\}$, so $S_n$ is not solvable.


At the beginning of this section, we said that we would prove for any $n \ge 1$ that every group
of order $n$ is isomorphic to a subgroup of $S_n$. This reinforces the notion that symmetric groups
are an extremely rich and interesting collection of groups. It might lead you to believe that
symmetric groups are the only finite groups you need to study, as a complete understanding of
the subgroups of symmetric groups would allow you to understand the structure of all finite
groups. For example, suppose you wanted to understand all groups $G$ such that $|G| = 11$. It
will follow from the result we are about to prove that $G$ is isomorphic to a subgroup of $S_{11}$.
However, $S_{11}$ contains


$11! = 39{,}916{,}800$


elements. Trying to find and understand the structure of all subgroups of a group with
39,916,800 elements is an incredibly difficult job. On the other hand, since 11 is prime,
groups of order 11 are very easy to understand, as up to isomorphism there is only one. Thus,
as a practical matter, there are usually easier and more direct ways to examine a group of order
$n$ than using the fact that an isomorphic copy of it is contained in $S_n$.




Theorem 8.54 (Cayley's Theorem). If $G$ is a group of order $n$, then $S_n$ contains a subgroup
isomorphic to $G$.


Proof. Since $G$ is a set with $n$ elements, we can think of $S_n$ as the set of bijections from $G$ to
$G$. We begin by associating to every element of $G$ an element of $S_n$. If $g \in G$, define the
function


$\pi_g : G \to G$
as $\pi_g(a) = ga$, for all $a \in G$. Certainly $\pi_g$ is a function from $G$ to $G$ and we need to show that
it is a bijection.
In order to show that $\pi_g$ is an injection, suppose $a, b \in G$ such that $\pi_g(a) = \pi_g(b)$; we must
prove that $a = b$. However,


$ga = \pi_g(a) = \pi_g(b) = gb$
and multiplying the outer terms of this equation on the left by $g^{-1}$ yields
$g^{-1}(ga) = g^{-1}(gb)$,
from which it quickly follows that $a = b$. Thus, $\pi_g$ is injective.


There are two ways to show that $\pi_g$ is a surjection. First, since $G$ is finite, any injection of $G$ is
also a surjection. However, we can also directly prove that $\pi_g$ is surjective without using the
fact that $G$ is finite. Observe that if $a \in G$, then


$\pi_g(g^{-1}a) = g(g^{-1}a) = (gg^{-1})a = ea = a$.


Therefore, every $a \in G$ belongs to the image of $\pi_g$, so $\pi_g$ is a surjection. As a result, for all
$g \in G$, $\pi_g$ is a bijection.


We can now define the function
$\phi : G \to S_n$


as $\phi(g) = \pi_g$, for all $g \in G$. We first need to show that $\phi$ is a homomorphism. If $g, h \in G$,
observe that


$\phi(gh) = \pi_{gh}$ and $\phi(g)\phi(h) = \pi_g \pi_h$.


Both $\pi_{gh}$ and $\pi_g \pi_h$ are functions from $G$ to $G$. To check that they are the same function, if
$a \in G$, we have


$\pi_{gh}(a) = (gh)a = g(ha) = g(\pi_h(a)) = \pi_g(\pi_h(a)) = (\pi_g \pi_h)(a)$.


Thus, $\pi_{gh} = \pi_g \pi_h$, so $\phi(gh) = \phi(g)\phi(h)$, and $\phi$ is indeed a homomorphism.


Since $\phi$ is a homomorphism, Theorem 8.39 asserts that $\mathrm{Im}(\phi)$ is a subgroup of $S_n$. We can
think of $\phi$ as being a homomorphism from $G$ to $\mathrm{Im}(\phi)$. In this context, $\phi$ is certainly surjective,
and in order to conclude that $G$ is isomorphic to $\mathrm{Im}(\phi)$, it suffices to show that $\phi$ is injective.
To this end, suppose $g, h \in G$ such that $\phi(g) = \phi(h)$; we need to show that $g = h$. Since


$\pi_g = \phi(g) = \phi(h) = \pi_h$,


$\pi_g$ and $\pi_h$ are the same function from $G$ to $G$. If we apply both functions to $e \in G$, we obtain
$\pi_g(e) = ge = g$ and $\pi_h(e) = he = h$,


so $g = \pi_g(e) = \pi_h(e) = h$. Thus, $\phi$ is also injective.


Therefore, $\phi : G \to \mathrm{Im}(\phi)$ is not only a homomorphism but also a bijection. As a result, $G$ is
isomorphic to $\mathrm{Im}(\phi)$ and $\mathrm{Im}(\phi)$ is a subgroup of $S_n$, as required.
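The construction in this proof can be carried out for a specific small group. The sketch below takes $S_3$ (a group of order 6, chosen here purely as a test case) and builds the left-multiplication maps as permutations of the positions 0, ..., 5 in a fixed list of the group's elements; the function name `cayley_embedding` is hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """Composition of tuple-encoded permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def cayley_embedding(elements, op):
    """Map each g to the permutation a -> g*a, recorded on positions in `elements`."""
    index = {a: k for k, a in enumerate(elements)}
    return {g: tuple(index[op(g, a)] for a in elements) for g in elements}


# Test group: S_3, written as tuples on {0, 1, 2}, with composition as the operation.
S3 = list(permutations(range(3)))
phi = cayley_embedding(S3, compose)

# Each pi_g is a bijection of the 6 positions.
is_bijection = all(sorted(pi) == list(range(6)) for pi in phi.values())
# phi(gh) equals phi(g) composed with phi(h).
is_homomorphism = all(phi[compose(g, h)] == compose(phi[g], phi[h])
                      for g in S3 for h in S3)
# Distinct group elements give distinct permutations.
is_injective = len(set(phi.values())) == len(S3)

print(is_bijection, is_homomorphism, is_injective)  # True True True
```

The image is a 6-element subgroup of the 720-element group of permutations of six positions, which illustrates how sparse the copy of $G$ inside $S_n$ typically is.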


In Section 8.2, we remarked that the converse of Lagrange's Theorem does not hold. In other
words, there exists a group $G$ of order $n$ that does not contain any subgroup of order $m$, for
some $m$ that divides $n$. The smallest example is a group $G$ of order 12, which is contained in
$S_4$, that contains no subgroup with six elements. We now present this example and show that it
has the desired properties.


Proposition 8.55. $S_4$ contains a subgroup $G$ of order 12 that does not contain a subgroup of
order 6.


Proof: Let


$G = \{e, (1\ 2\ 3), (1\ 3\ 2), (1\ 2\ 4), (1\ 4\ 2), (1\ 3\ 4), (1\ 4\ 3), (2\ 3\ 4),$
$(2\ 4\ 3), (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$;


it is somewhat tedious but not hard to check that $G$ is closed under multiplication. Therefore,
by Proposition 8.6(b), $G$ is indeed a group with 12 elements.


By way of contradiction, suppose $G$ contained a subgroup $H$ with six elements. By Sylow's
Theorem, there must exist $g, h \in H$ such that $o(g) = 2$ and $o(h) = 3$. Lemma 8.42(b) now
asserts that the set


$\{g^i h^j \mid 0 \le i \le 1,\ 0 \le j \le 2\}$


contains six different elements of $H$, so this set is equal to $H$.


Since $o(g)$ and $o(h)$ are relatively prime, $\langle g \rangle \cap \langle h \rangle = \{e\}$. Therefore, if $hg \in \langle g \rangle$,
we would obtain the contradiction

$h \in \langle g \rangle \cap \langle h \rangle = \{e\}$.
Similarly, if $hg \in \langle h \rangle$, then we would obtain the contradiction

$g \in \langle g \rangle \cap \langle h \rangle = \{e\}$.


However, we know that $hg \in H$, so out of the six elements in $H$, the only remaining
possibilities are that either $hg = gh$ or $hg = gh^2$.


If $hg = gh$, then $H$ is abelian, and it is easy to see that $o(gh) = 6$. However, when we look at
the list of elements of $G$, we can see that every element of $G$ has order 1, 2, or 3. This is a
contradiction, so we may now assume that $hg = gh^2$.


The only elements of order three in $G$ are 3-cycles, so $h$ is a 3-cycle. Observe that not only
does $G$ contain every 3-cycle in $S_4$, but it also contains every product of two disjoint
transpositions in $S_4$. In light of this, by appropriately naming the four elements of the set that
the functions in $S_4$ permute, we may assume that $h = (1\ 2\ 3)$. On the other hand, $g$ has order 2,
and $G$ only contains three elements of order 2. As a result, $g$ must be equal to one of

$(1\ 2)(3\ 4)$, $(1\ 3)(2\ 4)$, or $(1\ 4)(2\ 3)$.


By multiplying the equation $hg = gh^2$ on the left by $g$ and using $g^2 = e$, we obtain the equation
$ghg = h^2$. We now examine the values of $ghg$ that occur when $h = (1\ 2\ 3)$ and $g$ ranges through
the three possibilities $(1\ 2)(3\ 4)$, $(1\ 3)(2\ 4)$, and $(1\ 4)(2\ 3)$. If $g = (1\ 2)(3\ 4)$, we obtain


$ghg = ((1\ 2)(3\ 4))(1\ 2\ 3)((1\ 2)(3\ 4)) = (1\ 4\ 2) \ne (1\ 3\ 2) = h^2$;
if $g = (1\ 3)(2\ 4)$, we obtain

$ghg = ((1\ 3)(2\ 4))(1\ 2\ 3)((1\ 3)(2\ 4)) = (1\ 3\ 4) \ne (1\ 3\ 2) = h^2$;
and finally, if $g = (1\ 4)(2\ 3)$, we obtain

$ghg = ((1\ 4)(2\ 3))(1\ 2\ 3)((1\ 4)(2\ 3)) = (2\ 4\ 3) \ne (1\ 3\ 2) = h^2$.


Therefore, regardless of our choice of $g$, we obtain the contradiction $ghg \ne h^2$. As a result, $G$
cannot contain a subgroup $H$ with six elements.
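Because $G$ has only 12 elements, Proposition 8.55 can also be confirmed by exhaustive search. The sketch below builds $G$ as the set of even permutations of four points (this agrees with the list in the proof: the identity, the eight 3-cycles, and the three products of two disjoint transpositions), labels the points 0 to 3 for convenience, and tests every subset of each candidate size that contains the identity. A finite nonempty subset closed under the operation is a subgroup, so closure is all that is checked.

```python
from itertools import combinations, permutations


def compose(p, q):
    """Apply q first, then p (tuple-encoded permutations of 0..3)."""
    return tuple(p[q[i]] for i in range(len(q)))


def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0


G = [p for p in permutations(range(4)) if is_even(p)]
identity = tuple(range(4))
others = [p for p in G if p != identity]


def has_subgroup_of_order(k):
    """Search all k-element subsets containing the identity for a closed one."""
    for rest in combinations(others, k - 1):
        H = set(rest) | {identity}
        if all(compose(a, b) in H for a in H for b in H):
            return True
    return False


print(len(G))                                                    # 12
print([k for k in (1, 2, 3, 4, 6, 12) if has_subgroup_of_order(k)])
# [1, 2, 3, 4, 12]
```

The search inspects 462 candidate subsets of size six, far fewer than a hand proof would want to examine, which is why the text argues structurally instead.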


We conclude this chapter by verifying the remark we made in Section 8.3 that $S_4$ is solvable. If
$G$ is the subgroup of $S_4$ from Proposition 8.55, then $|S_4| = 24$ and $|G| = 12$. Since $G$ has
index 2 in $S_4$, Lemma 8.34 asserts that $G$ is a normal subgroup of $S_4$. Observe that the quotient
group $S_4/G$ has order 2, so it is solvable. In addition, it follows from Theorem 8.45 that
groups of order 12 are solvable, so $G$ is solvable. As a result, both $S_4/G$ and $G$ are solvable,
and Corollary 8.40 now asserts that $S_4$ is indeed solvable.




Exercises for Section 8.4 


1. In $S_n$, show that the inverse of the t-cycle $(a_1\ a_2\ \cdots\ a_{t-1}\ a_t)$ is the t-cycle
$(a_t\ a_{t-1}\ \cdots\ a_2\ a_1)$.


2. If $n \ge 1$, what is the order of the group $S_n$?


Exercises 3-41 will deal with functions $f, g, h, k \in S_8$. The behavior of $f, g, h, k$ on the set
$\{1, 2, 3, 4, 5, 6, 7, 8\}$ is described by the following.


$f(1) = 3$, $f(2) = 5$, $f(3) = 8$, $f(4) = 6$,
$f(5) = 2$, $f(6) = 1$, $f(7) = 7$, $f(8) = 4$.


$g(1) = 1$, $g(2) = 6$, $g(3) = 8$, $g(4) = 5$,
$g(5) = 4$, $g(6) = 7$, $g(7) = 2$, $g(8) = 3$.


$h(1) = 8$, $h(2) = 1$, $h(3) = 5$, $h(4) = 6$,
$h(5) = 2$, $h(6) = 4$, $h(7) = 3$, $h(8) = 7$.


$k(1) = 5$, $k(2) = 3$, $k(3) = 8$, $k(4) = 1$,
$k(5) = 4$, $k(6) = 6$, $k(7) = 7$, $k(8) = 2$.


3. Write $f$ as a product of disjoint cycles and find the order of $f$.
4. Write $g$ as a product of disjoint cycles and find the order of $g$.
5. Write $h$ as a product of disjoint cycles and find the order of $h$.
6. Write $k$ as a product of disjoint cycles and find the order of $k$.
7. Write $f^{-1}$ as a product of disjoint cycles.
8. Write $g^{-1}$ as a product of disjoint cycles.
9. Write $h^{-1}$ as a product of disjoint cycles.
10. Write $k^{-1}$ as a product of disjoint cycles.
11. Write $f^2$ as a product of disjoint cycles and find the order of $f^2$.
12. Write $g^2$ as a product of disjoint cycles and find the order of $g^2$.
13. Write $h^2$ as a product of disjoint cycles and find the order of $h^2$.
14. Write $k^2$ as a product of disjoint cycles and find the order of $k^2$.




15. Write $f \circ g$ as a product of disjoint cycles and find the order of $f \circ g$.
16. Write $g \circ f$ as a product of disjoint cycles and find the order of $g \circ f$.
17. Write $f \circ h$ as a product of disjoint cycles and find the order of $f \circ h$.
18. Write $h \circ f$ as a product of disjoint cycles and find the order of $h \circ f$.
19. Write $f \circ k$ as a product of disjoint cycles and find the order of $f \circ k$.
20. Write $k \circ f$ as a product of disjoint cycles and find the order of $k \circ f$.
21. Write $g \circ h$ as a product of disjoint cycles and find the order of $g \circ h$.
22. Write $h \circ g$ as a product of disjoint cycles and find the order of $h \circ g$.
23. Write $g \circ k$ as a product of disjoint cycles and find the order of $g \circ k$.
24. Write $k \circ g$ as a product of disjoint cycles and find the order of $k \circ g$.
25. Write $h \circ k$ as a product of disjoint cycles and find the order of $h \circ k$.
26. Write $k \circ h$ as a product of disjoint cycles and find the order of $k \circ h$.


27. In the exercises above, you have seen many examples where $a, b \in S_8$ and $ab \ne ba$. In
the cases you examined, how do $o(ab)$ and $o(ba)$ compare?


28. Let $G$ be a group and suppose $a, b \in G$ such that $(ab)^n = e$, for some $n \ge 1$, where $e$ is
the identity element of $G$.
(a) Show that $(ba)^{n-1} b = a^{-1}$.
(b) Use part (a) to show that $(ba)^n = e$.
(c) How do $o(ab)$ and $o(ba)$ compare?
(d) Is the answer from part (c) consistent with your observations in exercise 27?
29. Write $f$ two different ways as a product of transpositions.
30. Write $g$ two different ways as a product of transpositions.
31. Write $h$ two different ways as a product of transpositions.
32. Write $k$ two different ways as a product of transpositions.
33. Write $f^2$ two different ways as a product of transpositions.
34. Write $g^2$ two different ways as a product of transpositions.
35. Write $h^2$ two different ways as a product of transpositions.
36. Write $k^2$ two different ways as a product of transpositions.


37. If $\sigma_1, \sigma_2, \ldots, \sigma_t$ are transpositions, show that the inverse of the composition
$\sigma_1 \circ \sigma_2 \circ \cdots \circ \sigma_{t-1} \circ \sigma_t$ is $\sigma_t \circ \sigma_{t-1} \circ \cdots \circ \sigma_2 \circ \sigma_1$.


38. Use exercise 37 to write $f^{-1}$ in two different ways as a product of transpositions.
39. Use exercise 37 to write $g^{-1}$ in two different ways as a product of transpositions.
40. Use exercise 37 to write $h^{-1}$ in two different ways as a product of transpositions.
41. Use exercise 37 to write $k^{-1}$ in two different ways as a product of transpositions.
42. If $(a\ b\ c)$ is a 3-cycle in $S_n$ and $\sigma \in S_n$, show that $\sigma \circ (a\ b\ c) \circ \sigma^{-1} = (\sigma(a)\ \sigma(b)\ \sigma(c))$.


43. Generalize exercise 42 and show that if $(a_1\ a_2\ \cdots\ a_t)$ is a t-cycle in $S_n$ and $\sigma \in S_n$, then
$\sigma \circ (a_1\ a_2\ \cdots\ a_t) \circ \sigma^{-1} = (\sigma(a_1)\ \sigma(a_2)\ \cdots\ \sigma(a_t))$.


44. The groups $S_4$ and $D_{12}$ are both nonabelian groups of order 24.
(a) What is the largest order of an element in $S_4$?
(b) What is the largest order of an element of $D_{12}$?
(c) Are $S_4$ and $D_{12}$ isomorphic? Explain your answer.


45. The group $G$ in the proof of Proposition 8.55 and $D_6$ are both nonabelian groups of order
12. Use the fact that $G$ does not contain any subgroup with six elements to show that $G$
and $D_6$ are not isomorphic.


46. For which $n \in \mathbb{N}$ does $S_4$ contain a cyclic subgroup with $n$ elements?
47. For which $n \in \mathbb{N}$ does $S_5$ contain a cyclic subgroup with $n$ elements?
48. For which $n \in \mathbb{N}$ does $S_6$ contain a cyclic subgroup with $n$ elements?


49. If $\sigma_1, \sigma_2$ are different transpositions in $S_n$ which are not disjoint, show that $\sigma_1 \circ \sigma_2$ is a
3-cycle.


50. If $i, j, k, l$ are distinct positive integers, write the composition $(i\ k\ j) \circ (i\ k\ l)$ as a product
of disjoint cycles.


51. Use exercise 50 to show that if $\sigma_1, \sigma_2$ are disjoint transpositions, then there exist 3-cycles
$\tau_1, \tau_2$ such that $\tau_1 \circ \tau_2 = \sigma_1 \circ \sigma_2$.


52. Use exercises 49 and 51 to show that any product of an even number of transpositions
can be rewritten as a product of 3-cycles.


53. Find the smallest $n \in \mathbb{N}$ such that $S_n$ contains an element of order 10.


54. Find the smallest $n \in \mathbb{N}$ such that $S_n$ contains an element of order 15.




55. Find the smallest $n \in \mathbb{N}$ such that $S_n$ contains an element of order 16.
56. Find the smallest $n \in \mathbb{N}$ such that $S_n$ contains an element of order 30.


In exercises 57-69, you will be asked to prove various facts about the group G from the proof 
of Proposition 8.55. At first glance, these exercises may appear to require a good deal of 
computation. However, if you use Lagrange’s Theorem, Sylow’s Theorem, and the fact that G 
does not contain any subgroups with six elements, relatively little computation should be 
necessary. These exercises will refer to the normalizer of a subgroup, which was a concept 
introduced in exercises 55-58 in Section 8.3. 


57. Without finding all the subgroups of G, for which n € N does G contain a subgroup of 
order n? 


58. If $g, h \in G$ such that $o(g) = 2$ and $o(h) = 3$, show that $gh \ne hg$.
59. If $g, h \in G$ such that $o(g) = 2$ and $o(h) = 3$, show that $gh \ne h^2 g$.


60. If $h \in G$ such that $o(h) = 3$, show that the subgroups $\langle h \rangle$, $C(h)$, and $N(\langle h \rangle)$ are all
the same.


61. By looking at the number of elements of order 2, show that G has exactly one subgroup 
with four elements. 


For the remainder of these exercises, K will denote the unique subgroup of G with the four 
elements referred to in exercise 61. 


62. If $g \in G$ such that $o(g) = 2$, show that $C(g) = K$ and $N(K) = G$.


63. If $g \in G$ has order 2, determine the number of conjugates of $g$ and write down the
elements of $G$ that are conjugate to $g$. In light of exercises 61 and 62, you should not
need to do any computations inside of $G$.


64. If $h \in G$ has order 3 and $a, b$ are distinct elements of $K$, show that $a^{-1}ha \ne b^{-1}hb$.


65. If $h_1 \in G$ has order 3 and $h_2 \in G$ is conjugate to $h_1$, show that there exists some $a \in K$
such that $h_2 = a^{-1} h_1 a$.


66. If $h \in G$ has order 3, show that $h$ and $h^2$ are not conjugate.


67. Find all conjugates of the 3-cycle (1 2 3). Exercise 65 can reduce the number of 
computations needed. 


68. Find all conjugates of the 3-cycle (1 3 2). Exercise 65 can reduce the number of 
computations needed. 


69. Look at your answers to exercises 63, 67, and 68, and check to see if they are consistent 
with the class equation. 


Polynomials over the Integers 
and Rationals 


In your previous algebra courses, many problems and questions probably arose that dealt with 
the roots and factoring of polynomials in $\mathbb{Z}[x]$ and $\mathbb{Q}[x]$. We are now in a position to apply
our work on $\mathbb{Z}$ and $\mathbb{Z}_n$ from Chapters 3 and 7 to deal with many of these same problems.
Although the focus of this chapter will be Z[x] and Q[x], occasionally we will work in the 
more general context of fields and commutative rings. Hopefully, by continuing to look at 
concrete examples, you will become more and more comfortable working with fields and 
commutative rings. 


9.1 Integral Domains and Homomorphisms of Rings 


When asked to multiply polynomials consisting of several terms, students often forget that the 
key to multiplying polynomials is the distributive law. Names like “FOIL” are sometimes 
given in certain situations to help students remember a procedure. However, it is much more 
important to realize that the distributive law can always be used, regardless of how many 
terms appear in the polynomials that are being multiplied. 


■ Example

$$\begin{aligned}
(3x^2 + 2x + 8)(4x^5 - 6x^3 + 5x) &= 3x^2(4x^5 - 6x^3 + 5x) + 2x(4x^5 - 6x^3 + 5x) + 8(4x^5 - 6x^3 + 5x) \\
&= (12x^7 - 18x^5 + 15x^3) + (8x^6 - 12x^4 + 10x^2) + (32x^5 - 48x^3 + 40x) \\
&= 12x^7 + 8x^6 + 14x^5 - 12x^4 - 33x^3 + 10x^2 + 40x.
\end{aligned}$$


More generally, the distributive law tells us that when multiplying the polynomials 


$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

and

$$g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0,$$

the coefficient of $x^t$ in $f(x) \cdot g(x)$ is $a_t b_0 + a_{t-1} b_1 + \cdots + a_1 b_{t-1} + a_0 b_t$.
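The coefficient formula above translates directly into a short program. The following is only a sketch: the function name `poly_mul` and the choice to store a polynomial as a list of coefficients, lowest degree first, are assumptions made for illustration and are not notation from the text.

```python
def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists, lowest degree first.

    The coefficient of x^t in the product is the sum of f[i] * g[j]
    over all pairs with i + j = t, which is the distributive law in bulk.
    """
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product


f = [8, 2, 3]              # 3x^2 + 2x + 8
g = [0, 5, 0, -6, 0, 4]    # 4x^5 - 6x^3 + 5x
print(poly_mul(f, g))      # coefficients of the product, lowest degree first
```

Read from the highest degree down, the printed list gives the same seven terms as the worked example at the start of this section.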


Unusual things can happen if the coefficients of our polynomials belong to a ring with zero 
divisors. Recall that if $\mathbb{Z}_n$ has zero divisors, then the degrees of polynomials need not add
when we multiply in $\mathbb{Z}_n[x]$. For example, in $\mathbb{Z}_{10}[x]$ we have

$$\begin{aligned}
([4]_{10}x^7 + [3]_{10}x^3 + [8]_{10}x)([5]_{10}x^2) &= [20]_{10}x^9 + [15]_{10}x^5 + [40]_{10}x^3 \\
&= [0]_{10}x^9 + [5]_{10}x^5 + [0]_{10}x^3 = [5]_{10}x^5.
\end{aligned}$$


However, at the moment, we are primarily interested in understanding polynomials in Z[x] 
and Q[x]. Therefore, in this chapter, we will usually look at commutative rings without zero 
divisors. 


Definition 9.1. A commutative ring R is called an integral domain if it has no zero divisors. 


m= Examples 


Some examples of integral domains are Z, Q, R, C, Z[x], R[x], C[x], R[x, y]. Also, when
p is a prime number, $\mathbb{Z}_p$ and $\mathbb{Z}_p[x]$ are integral domains. In particular, every field is
an integral domain, and whenever R is an integral domain, then so is R[x], the set of 
polynomials with coefficients in R. 


Before listing some of the basic properties of multiplication in R[x], for integral domains R, 
we need some terminology. 


Definition 9.2. Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_{m+1} x^{m+1} + a_m x^m$ be a polynomial
where $a_n, a_m \neq 0$ and $n \geq m$. We call $a_n x^n$ the leading term of $f(x)$ and $a_m x^m$ the trailing
term of $f(x)$. In this case, we also call $a_n$ the leading coefficient of $f(x)$ and $a_m$ the trailing
coefficient of $f(x)$. Finally, we say that n is the degree of f(x), and we write this as


deg( f(x)) =n. 




■ Examples


If $f(x) = 23x^{14} - 6x^9 + \cdots - 4x^3$ in Q[x], then the leading term is $23x^{14}$, the leading
coefficient is 23, the trailing term is $-4x^3$, the trailing coefficient is $-4$, and


deg( f(x)) = 14.


In $\mathbb{Z}_5[x]$, if $g(x) = [4]_5 x^8$, then $[4]_5 x^8$ is both the leading and trailing term and
deg(g(x)) = 8.


We can now record some standard facts about the multiplication of polynomials with 
coefficients in an integral domain. 


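Lemma 9.3. Let

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_{s+1} x^{s+1} + a_s x^s$$

and

$$g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_{t+1} x^{t+1} + b_t x^t$$

be polynomials with coefficients in an integral domain where $a_n, a_s, b_m, b_t \neq 0$, $n \geq s$, and
$m \geq t$. Then


(a) the leading term of $f(x) \cdot g(x)$ is $a_n b_m x^{n+m}$ and $\deg(f(x) \cdot g(x)) = n + m$;
(b) the trailing term of $f(x) \cdot g(x)$ is $a_s b_t x^{s+t}$;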


(c) f(x)- g(x) consists of only one term with a nonzero coefficient if and only if both f(x) and 
g(x) consist of only one term with a nonzero coefficient. 


Intuition. Although multiplying polynomials is a straightforward procedure, computing all 
the terms in the product of two polynomials can sometimes be a long, tedious job. However, 
it is always quick and easy to find both the leading and trailing terms of a product if the 
coefficients of our polynomials belong to an integral domain. In the product f(x) - g(x), the 
leading term is the product of the leading terms of f(x) and g(x) and the trailing term is the 
product of the trailing terms of f(x) and g(x). For example, consider 


f(x) = 3x!? + 6x! — 49° +.2x4 — 8x? and g(x) = 4x°—11x° + 9x4 


in Q[x]. We can immediately see that the leading term of $f(x) \cdot g(x)$ is $(3x^{12})(4x^6) = 12x^{18}$
and the trailing term is $(-8x^3)(9x^4) = -72x^7$.




Proof. If

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_{s+1} x^{s+1} + a_s x^s$$

and

$$g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_{t+1} x^{t+1} + b_t x^t,$$

then the largest exponent that appears in f(x) is n and the largest exponent that appears in g(x)
is m. Therefore, the largest exponent that could possibly appear in $f(x) \cdot g(x)$ is $n + m$. The
only way to obtain a term of degree $n + m$ in $f(x) \cdot g(x)$ is by multiplying the $a_n x^n$ term from
f(x) and the $b_m x^m$ term from g(x). Since $a_n$ and $b_m$ are nonzero elements of an integral
domain, the product $(a_n x^n)(b_m x^m) = a_n b_m x^{n+m}$ is nonzero. Thus, the term $a_n b_m x^{n+m}$ does
indeed appear in $f(x) \cdot g(x)$ with a nonzero coefficient, and it is therefore the leading term of
$f(x) \cdot g(x)$. In addition, this also tells us that $\deg(f(x) \cdot g(x)) = n + m$, thereby proving part (a).


The proof of part (b) is very similar. The smallest exponent that appears in f(x) is s and the
smallest exponent that appears in g(x) is t. Therefore, the smallest exponent that could
possibly appear in $f(x) \cdot g(x)$ is $s + t$. The only way to obtain a term of degree $s + t$ in
$f(x) \cdot g(x)$ is by multiplying the $a_s x^s$ term from f(x) and the $b_t x^t$ term from g(x). Since $a_s$
and $b_t$ are nonzero elements of an integral domain, the product $(a_s x^s)(b_t x^t) = a_s b_t x^{s+t}$ is
nonzero. Thus, the term $a_s b_t x^{s+t}$ does indeed appear in $f(x) \cdot g(x)$ with a nonzero coefficient,
and it is therefore the trailing term of $f(x) \cdot g(x)$.


For part (c), one direction is quite easy. If both f(x) and g(x) consist of a single nonzero term,
then $f(x) = a_n x^n$ and $g(x) = b_m x^m$. Thus, $f(x) \cdot g(x) = a_n b_m x^{n+m}$, so $f(x) \cdot g(x)$ certainly
consists of a single nonzero term. In the other direction, if $f(x) \cdot g(x)$ consists of only one
nonzero term, then the leading term $a_n b_m x^{n+m}$ and the trailing term $a_s b_t x^{s+t}$ must be the same
term. This tells us that $n + m = s + t$. However, since $n \geq s$ and $m \geq t$, this can only happen if
$n = s$ and $m = t$. As a result, $f(x) = a_n x^n$ and $g(x) = b_m x^m$, as desired.


Observe that part (a) of Lemma 9.3 also tells us that if f(x) and g(x) are both nonzero, then
$f(x) \cdot g(x) \neq 0$. That is the reason why R[x] is an integral domain whenever R is an integral
domain.
m Examples 
In Zy [x], if 

fx) = (6x? + 00hix* + 2hix 


and 


e(x) = [8x + [5]ipx8 + [8]irx° + 19] 112%, 


then the leading term of $f(x) \cdot g(x)$ is $([6]_{11}x^7)([8]_{11}x^{14}) = [4]_{11}x^{21}$ and the trailing term is
(Zine lia = (The: 


In Zo9[x], if 
f(x) = [25]n9x* + [4]o0x7 + [15]29x° 
and 


g(x) = [6]29x~* + [17 ]29x'? + [20]o9x'? + [S]29x"° + [21 ]29x”, 


then the leading term of f(x) - g(x) is ((25]20x°) ([6]29x7*) = [5]29x°* and the trailing term 
is ({15]29x°)([21]29x?) = [25]29x!*. 


If we revisit an example from earlier in this section, we can see why our coefficients need 
to belong to an integral domain for the conclusions of Lemma 9.3 to hold. Recall that in 
$\mathbb{Z}_{10}[x]$ we had

$$\begin{aligned}
([4]_{10}x^7 + [3]_{10}x^3 + [8]_{10}x)([5]_{10}x^2) &= [20]_{10}x^9 + [15]_{10}x^5 + [40]_{10}x^3 \\
&= [0]_{10}x^9 + [5]_{10}x^5 + [0]_{10}x^3 = [5]_{10}x^5.
\end{aligned}$$

Observe that in this example the conclusions in Lemma 9.3 do not hold. In particular,
$[5]_{10}x^5$ has degree 5, whereas the polynomials being multiplied have degrees 7 and 2.
Furthermore, $[5]_{10}x^5$ is the only term in the product with a nonzero coefficient, yet the
polynomial $[4]_{10}x^7 + [3]_{10}x^3 + [8]_{10}x$ has three terms with nonzero coefficients.
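The failure of the degree rule can be checked mechanically. The sketch below is illustrative only: the helper names `poly_mul_mod` and `degree`, and the coefficient-list representation (lowest degree first), are assumptions and not part of the text. It multiplies the same two polynomials modulo 10, where zero divisors exist, and modulo the prime 11, where they do not.

```python
def poly_mul_mod(f, g, n):
    """Multiply coefficient lists (lowest degree first), reducing mod n."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] = (product[i + j] + a * b) % n
    return product


def degree(f):
    """Largest exponent with a nonzero coefficient, or None for the zero polynomial."""
    nonzero = [i for i, c in enumerate(f) if c != 0]
    return max(nonzero) if nonzero else None


f = [0, 8, 0, 3, 0, 0, 0, 4]   # 4x^7 + 3x^3 + 8x
g = [0, 0, 5]                  # 5x^2

print(degree(poly_mul_mod(f, g, 10)))   # degree of the product in Z_10[x]
print(degree(poly_mul_mod(f, g, 11)))   # degree of the product in Z_11[x]
```

Modulo 10 the leading coefficient 4 · 5 = 20 vanishes, so the degree falls below 7 + 2; modulo 11 no product of nonzero coefficients can vanish, so the degrees add as Lemma 9.3 predicts.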


Since we will soon use $\mathbb{Z}_n$ and $\mathbb{Z}_n[x]$ to obtain information about polynomials in Z[x] and
Q[x], it is reasonable to wonder how one uses $\mathbb{Z}_n$ to learn about $\mathbb{Z}$. To begin answering
this question, recall that addition and multiplication in $\mathbb{Z}_n$ are defined as

(1) $[a]_n + [b]_n = [a + b]_n$ and $[a]_n \cdot [b]_n = [a \cdot b]_n$,

for all $a, b \in \mathbb{Z}$. Next, if $n \geq 2$, consider the function

$$\sigma_n : \mathbb{Z} \to \mathbb{Z}_n$$

defined as

$$\sigma_n(a) = [a]_n,$$

for all $a \in \mathbb{Z}$. By (1), it follows that

$$\sigma_n(a + b) = \sigma_n(a) + \sigma_n(b) \quad \text{and} \quad \sigma_n(a \cdot b) = \sigma_n(a) \cdot \sigma_n(b),$$

for all $a, b \in \mathbb{Z}$. Observe that the equations satisfied by $\sigma_n$ are identical to those satisfied by
automorphisms, and this leads us to


Definition 9.4. If R and S are rings, we call a function $\tau : R \to S$ a homomorphism if

$$\tau(a + b) = \tau(a) + \tau(b) \quad \text{and} \quad \tau(a \cdot b) = \tau(a) \cdot \tau(b),$$

for all $a, b \in R$.


At various points in this chapter, such as in Definition 9.4, we refer to rings instead of 
commutative rings. When we do this, we are no longer assuming that multiplication is 
commutative. We occasionally deal with rings instead of commutative rings because
various ideas and concepts, such as homomorphisms, make perfect sense even when
not assuming that multiplication is commutative.


In Definition 8.23, we defined homomorphisms of groups. It should come as little surprise that 
Definitions 8.23 and 9.4 are identical except that homomorphisms of rings need to deal with 
two binary operations instead of one. In particular, if $\tau : R \to S$ is a homomorphism of rings,
and if we ignore multiplication, then $\tau$ is still a homomorphism of groups under addition.


As we saw in Chapter 8, group homomorphisms are strongly connected to normal subgroups
and quotient groups. In a very similar manner, homomorphisms of rings are connected to
objects known as ideals and quotient rings. To obtain information about the roots and factoring
of polynomials in Z[x] and Q[x], we are primarily interested in homomorphisms from $\mathbb{Z}$ to $\mathbb{Z}_n$
and from $\mathbb{Z}[x]$ to $\mathbb{Z}_n[x]$. However, when we prove Kronecker’s Theorem in Chapter 17, we
will take a deeper and more general look at homomorphisms of rings and their connections 
with ideals and quotient rings. 


If you look back at Lemma 5.9, you will see that it is stated in terms of automorphisms of 
commutative rings. However, having now been introduced to homomorphisms of rings, it is 
easy to see that the proof of Lemma 5.9 actually yields a more general result. Since this more 
general result will be needed throughout this chapter, we record it as 


Lemma 9.5. Suppose $\tau : R \to S$ is a homomorphism of rings. If $x_1, x_2, \ldots, x_n \in R$,
then

$$\tau(x_1 + x_2 + \cdots + x_n) = \tau(x_1) + \tau(x_2) + \cdots + \tau(x_n)$$

and

$$\tau(x_1 \cdot x_2 \cdots x_n) = \tau(x_1) \cdot \tau(x_2) \cdots \tau(x_n).$$


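■ Examples: Homomorphisms of Rings
1. Our motivating example was, for every $n \geq 2$, to let

$$\sigma_n : \mathbb{Z} \to \mathbb{Z}_n$$

be defined as $\sigma_n(a) = [a]_n$, for all $a \in \mathbb{Z}$.
We can now use $\sigma_n$ to define, for every $n \geq 2$,

$$\tau_n : \mathbb{Z}[x] \to \mathbb{Z}_n[x]$$

as

$$\tau_n(a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0) = [a_m]_n x^m + [a_{m-1}]_n x^{m-1} + \cdots + [a_1]_n x + [a_0]_n,$$

for all $a_i \in \mathbb{Z}$.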


Observe that if $f(x) = a_m x^m + \cdots + a_1 x + a_0$ and $g(x) = b_l x^l + \cdots + b_1 x + b_0$ belong
to $\mathbb{Z}[x]$, then the coefficients of $x^t$ in $f(x) + g(x)$ and $f(x) \cdot g(x)$ are, respectively,
$a_t + b_t$ and $a_t b_0 + a_{t-1} b_1 + \cdots + a_1 b_{t-1} + a_0 b_t$. As a result, the coefficient of $x^t$ in
$\tau_n(f(x) + g(x))$ is $[a_t + b_t]_n$, whereas the coefficient of $x^t$ in $\tau_n(f(x)) + \tau_n(g(x))$ is
$[a_t]_n + [b_t]_n$. Since $\sigma_n$ is a homomorphism, it follows that $x^t$ has the same coefficient
in both $\tau_n(f(x) + g(x))$ and $\tau_n(f(x)) + \tau_n(g(x))$, for every $t \geq 0$. Thus,

$$\tau_n(f(x) + g(x)) = \tau_n(f(x)) + \tau_n(g(x)),$$

for all $f(x), g(x) \in \mathbb{Z}[x]$.
Similarly, the coefficient of $x^t$ in $\tau_n(f(x) \cdot g(x))$ is

$$[a_t b_0 + a_{t-1} b_1 + \cdots + a_1 b_{t-1} + a_0 b_t]_n,$$

whereas the coefficient of $x^t$ in $\tau_n(f(x)) \cdot \tau_n(g(x))$ is

$$[a_t]_n [b_0]_n + [a_{t-1}]_n [b_1]_n + \cdots + [a_1]_n [b_{t-1}]_n + [a_0]_n [b_t]_n.$$

However, since $\sigma_n$ is a homomorphism, Lemma 9.5 asserts that

$$\begin{aligned}
[a_t b_0 + a_{t-1} b_1 + \cdots + a_1 b_{t-1} + a_0 b_t]_n &= [a_t b_0]_n + [a_{t-1} b_1]_n + \cdots + [a_1 b_{t-1}]_n + [a_0 b_t]_n \\
&= [a_t]_n [b_0]_n + [a_{t-1}]_n [b_1]_n + \cdots + [a_1]_n [b_{t-1}]_n + [a_0]_n [b_t]_n.
\end{aligned}$$

Therefore, the coefficient of $x^t$ is the same in both $\tau_n(f(x) \cdot g(x))$ and
$\tau_n(f(x)) \cdot \tau_n(g(x))$, for every $t \geq 0$, so

$$\tau_n(f(x) \cdot g(x)) = \tau_n(f(x)) \cdot \tau_n(g(x)),$$

for all $f(x), g(x) \in \mathbb{Z}[x]$. As a result, $\tau_n$ preserves both addition and multiplication
and is indeed a homomorphism of rings.


To illustrate this, let $f(x) = 6x^3 + 2x^2 - 7$ and $g(x) = 4x + 6$. If $n = 5$, we now have

$$\tau_5(f(x)) = x^3 + [2]_5 x^2 + [3]_5 \quad \text{and} \quad \tau_5(g(x)) = [4]_5 x + [1]_5.$$

Therefore,

$$\tau_5(f(x) + g(x)) = \tau_5(6x^3 + 2x^2 + 4x - 1) = x^3 + [2]_5 x^2 + [4]_5 x + [4]_5$$

and

$$\tau_5(f(x)) + \tau_5(g(x)) = (x^3 + [2]_5 x^2 + [3]_5) + ([4]_5 x + [1]_5) = x^3 + [2]_5 x^2 + [4]_5 x + [4]_5.$$

Similarly,

$$\tau_5(f(x) \cdot g(x)) = \tau_5(24x^4 + 44x^3 + 12x^2 - 28x - 42) = [4]_5 x^4 + [4]_5 x^3 + [2]_5 x^2 + [2]_5 x + [3]_5$$

and

$$\tau_5(f(x)) \cdot \tau_5(g(x)) = (x^3 + [2]_5 x^2 + [3]_5) \cdot ([4]_5 x + [1]_5) = [4]_5 x^4 + [4]_5 x^3 + [2]_5 x^2 + [2]_5 x + [3]_5.$$
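The same comparison can be run as a small program. This is a minimal sketch under stated assumptions: polynomials are coefficient lists with the lowest degree first, and the names `poly_add`, `poly_mul`, and `tau` are hypothetical helpers chosen here, with `tau(f, n)` standing in for reducing every coefficient modulo n.

```python
def poly_add(f, g):
    """Add coefficient lists (lowest degree first), padding the shorter one with zeros."""
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return [a + b for a, b in zip(f, g)]


def poly_mul(f, g):
    """Multiply coefficient lists (lowest degree first)."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product


def tau(f, n):
    """Reduce each coefficient modulo n (a stand-in for tau_n on coefficient lists)."""
    return [c % n for c in f]


f = [-7, 0, 2, 6]   # 6x^3 + 2x^2 - 7
g = [6, 4]          # 4x + 6
n = 5

# Operate in Z[x] and then reduce, versus reduce first and then operate in Z_5[x].
sum_then_reduce = tau(poly_add(f, g), n)
reduce_then_sum = tau(poly_add(tau(f, n), tau(g, n)), n)
product_then_reduce = tau(poly_mul(f, g), n)
reduce_then_product = tau(poly_mul(tau(f, n), tau(g, n)), n)

print(sum_then_reduce == reduce_then_sum)
print(product_then_reduce == reduce_then_product)
```

Agreement on one pair of polynomials is of course not a proof; it only mirrors the hand computation above.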


2. We can now generalize our previous example. Suppose $\sigma : R \to S$ is a
homomorphism of rings. Next, define

$$\tau : R[x] \to S[x]$$

as

$$\tau(a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0) = \sigma(a_m) x^m + \sigma(a_{m-1}) x^{m-1} + \cdots + \sigma(a_1) x + \sigma(a_0),$$

for all $a_i \in R$. The same argument that shows that each $\tau_n$ in the previous example is
a homomorphism of Z[x] also shows that $\tau$ is a homomorphism of R[x].


3. $\mathbb{R}[x]$ is a ring and we can let

$$\tau : \mathbb{R}[x] \to \mathbb{R}[x]$$

be defined as $\tau(f(x)) = f(x^2)$, for all $f(x) \in \mathbb{R}[x]$. Then $\tau$ is a homomorphism as

$$\tau(f(x) + g(x)) = f(x^2) + g(x^2) = \tau(f(x)) + \tau(g(x))$$

and

$$\tau(f(x) \cdot g(x)) = f(x^2) \cdot g(x^2) = \tau(f(x)) \cdot \tau(g(x)),$$

for all $f(x), g(x) \in \mathbb{R}[x]$.
For an example of how $\tau$ behaves, let $f(x) = 7x - 5$ and $g(x) = 3x + 23$. Then

$$\tau(f(x) + g(x)) = \tau(10x + 18) = 10(x^2) + 18 = 10x^2 + 18$$

and

$$\tau(f(x)) + \tau(g(x)) = \tau(7x - 5) + \tau(3x + 23) = (7(x^2) - 5) + (3(x^2) + 23) = 10x^2 + 18.$$

Furthermore,

$$\tau(f(x) \cdot g(x)) = \tau(21x^2 + 146x - 115) = 21(x^2)^2 + 146(x^2) - 115 = 21x^4 + 146x^2 - 115$$

and

$$\tau(f(x)) \cdot \tau(g(x)) = \tau(7x - 5) \cdot \tau(3x + 23) = (7(x^2) - 5) \cdot (3(x^2) + 23) = 21x^4 + 146x^2 - 115.$$


4. $\mathbb{R}[x]$ and $\mathbb{R}$ are both rings, and we let

$$\tau : \mathbb{R}[x] \to \mathbb{R}$$

be defined as $\tau(f(x)) = f(2)$, for all $f(x) \in \mathbb{R}[x]$. Then $\tau$ is a homomorphism as

$$\tau(f(x) + g(x)) = f(2) + g(2) = \tau(f(x)) + \tau(g(x))$$

and

$$\tau(f(x) \cdot g(x)) = f(2) \cdot g(2) = \tau(f(x)) \cdot \tau(g(x)),$$

for all $f(x), g(x) \in \mathbb{R}[x]$.

For an example of the behavior of $\tau$, let $f(x) = 5x - 11$ and $g(x) = 6x^2 + 2x + 9$. Then

$$\tau(f(x) + g(x)) = \tau(6x^2 + 7x - 2) = 6(2)^2 + 7(2) - 2 = 36$$

and

$$\tau(f(x)) + \tau(g(x)) = (5(2) - 11) + (6(2)^2 + 2(2) + 9) = (-1) + (37) = 36.$$

Similarly,

$$\tau(f(x) \cdot g(x)) = \tau(30x^3 - 56x^2 + 23x - 99) = 30(2)^3 - 56(2)^2 + 23(2) - 99 = -37$$

and

$$\tau(f(x)) \cdot \tau(g(x)) = \tau(5x - 11) \cdot \tau(6x^2 + 2x + 9) = (5(2) - 11) \cdot (6(2)^2 + 2(2) + 9) = (-1) \cdot (37) = -37.$$
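Evaluation at a point is easy to experiment with. The sketch below is an illustration only: the helper names `evaluate`, `poly_add`, and `poly_mul` and the coefficient-list representation (lowest degree first) are assumptions made here. It checks, for the two polynomials of this example, that evaluating at 2 after adding or multiplying agrees with adding or multiplying the values.

```python
def poly_add(f, g):
    """Add coefficient lists (lowest degree first), padding the shorter one with zeros."""
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return [a + b for a, b in zip(f, g)]


def poly_mul(f, g):
    """Multiply coefficient lists (lowest degree first)."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product


def evaluate(f, point):
    """Evaluate a coefficient list (lowest degree first) at a point via Horner's rule."""
    value = 0
    for c in reversed(f):
        value = value * point + c
    return value


f = [-11, 5]     # 5x - 11
g = [9, 2, 6]    # 6x^2 + 2x + 9

print(evaluate(poly_add(f, g), 2), evaluate(f, 2) + evaluate(g, 2))
print(evaluate(poly_mul(f, g), 2), evaluate(f, 2) * evaluate(g, 2))
```

Changing the second argument of `evaluate` from 2 to any other number gives the analogous evaluation map at that number.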


5. Every automorphism of a commutative ring or field is an example of a 
homomorphism of rings. In Chapter 5, we were introduced to Gal(L/K), the 
automorphisms of a commutative ring L that are the identity map on the 
commutative ring K. Therefore, all the elements of Gal(L/K) are examples of 
homomorphisms of rings. 


Exercises for Section 9.1 
In exercises 1-3, find the degree, leading term, and trailing term of each product in Q[x]. 
1. (7x9 —4x3 — 11x? + 54x — 13)- Ox® — 16x” +.4x° — 43x3 — 22x 4 6) 
2. (5x3 — 7x2 +8x—3)- (11x? + 24x — 16x4* + 61x? — 20x) - (4x5 — 10x? + 57x? — 19x +7) 
3. (2x7 -6x+5)4 
In exercises 4-6, find the degree, leading term, and trailing term of each product in $\mathbb{Z}_5[x]$.
4, ([3]sx!* + [2]5x* + [2]5x? + [4]s) - (25x? +28 + [3]sx4 + [4]5x°) 
5. ([3)sx° +23 + [4]5x7) - ([4]sx8 + [2]5x° + [3]5x° + [3]s) - ((2]5x* + [2]5x? + [3]5x7) 
6. ((2)sx’ +23 + [3]sx)? 
In exercises 7 and 8, find the degree, leading term, and trailing term of each product in $\mathbb{Z}_8[x]$.
7. ([S]sx! + [4]sx7) - (6]sx° + [3]sx? + [2]8x) 
8. ([2]sx+1)° 


In exercises 9 and 10, find the degree, leading term, and trailing term of each product in $\mathbb{Z}_{10}[x]$.
9. ([8hiox? + [5]iox + [2]10) - (5]}10x7 + [6]10) 
10. ((4]10x+[7]10)° 
In exercises 11-18, let $\sigma : \mathbb{Q}[x] \to \mathbb{Q}$ be the homomorphism of rings defined as
$\sigma(p(x)) = p(5)$, for all $p(x) \in \mathbb{Q}[x]$. In addition, let
f(x) =2x° +1, g(x) =2x-7, A(x) = x7 —6x4+5. 
11. Compute $\sigma(f(x))$, $\sigma(g(x))$, and $\sigma(h(x))$.


12. Compute $f(x) + g(x)$, $\sigma(f(x) + g(x))$, and $\sigma(f(x)) + \sigma(g(x))$. Then compare
$\sigma(f(x) + g(x))$ and $\sigma(f(x)) + \sigma(g(x))$.


13. Compute $g(x) + h(x)$, $\sigma(g(x) + h(x))$, and $\sigma(g(x)) + \sigma(h(x))$. Then compare
$\sigma(g(x) + h(x))$ and $\sigma(g(x)) + \sigma(h(x))$.


14. Compute $f(x) + h(x)$, $\sigma(f(x) + h(x))$, and $\sigma(f(x)) + \sigma(h(x))$. Then compare
$\sigma(f(x) + h(x))$ and $\sigma(f(x)) + \sigma(h(x))$.


15. Compute $f(x) \cdot g(x)$, $\sigma(f(x) \cdot g(x))$, and $\sigma(f(x)) \cdot \sigma(g(x))$. Then compare $\sigma(f(x) \cdot g(x))$
and $\sigma(f(x)) \cdot \sigma(g(x))$.


16. Compute $g(x) \cdot h(x)$, $\sigma(g(x) \cdot h(x))$, and $\sigma(g(x)) \cdot \sigma(h(x))$. Then compare $\sigma(g(x) \cdot h(x))$
and $\sigma(g(x)) \cdot \sigma(h(x))$.


17. Compute $f(x) \cdot h(x)$, $\sigma(f(x) \cdot h(x))$, and $\sigma(f(x)) \cdot \sigma(h(x))$. Then compare $\sigma(f(x) \cdot h(x))$
and $\sigma(f(x)) \cdot \sigma(h(x))$.


18. Describe, as simply as possible, the elements of $\mathrm{Ker}(\sigma)$ and $\mathrm{Im}(\sigma)$.


In exercises 19-26, let $\tau : \mathbb{R}[x] \to \mathbb{C}$ be the homomorphism of rings defined as $\tau(p(x)) = p(i)$,
for all $p(x) \in \mathbb{R}[x]$. In addition, let


$f(x) = 5x + 4$, $g(x) = 2x^3 + 3x^2 + 2x + 3$, $h(x) = 4x^2 - 6x + 7$.

19. Compute $\tau(f(x))$, $\tau(g(x))$, and $\tau(h(x))$.


20. Compute $f(x) + g(x)$, $\tau(f(x) + g(x))$, and $\tau(f(x)) + \tau(g(x))$. Then compare
$\tau(f(x) + g(x))$ and $\tau(f(x)) + \tau(g(x))$.


21. Compute $g(x) + h(x)$, $\tau(g(x) + h(x))$, and $\tau(g(x)) + \tau(h(x))$. Then compare
$\tau(g(x) + h(x))$ and $\tau(g(x)) + \tau(h(x))$.


22. Compute $f(x) + h(x)$, $\tau(f(x) + h(x))$, and $\tau(f(x)) + \tau(h(x))$. Then compare
$\tau(f(x) + h(x))$ and $\tau(f(x)) + \tau(h(x))$.


23. Compute $f(x) \cdot g(x)$, $\tau(f(x) \cdot g(x))$, and $\tau(f(x)) \cdot \tau(g(x))$. Then compare $\tau(f(x) \cdot g(x))$
and $\tau(f(x)) \cdot \tau(g(x))$.


24. Compute $g(x) \cdot h(x)$, $\tau(g(x) \cdot h(x))$, and $\tau(g(x)) \cdot \tau(h(x))$. Then compare $\tau(g(x) \cdot h(x))$
and $\tau(g(x)) \cdot \tau(h(x))$.


25. Compute $f(x) \cdot h(x)$, $\tau(f(x) \cdot h(x))$, and $\tau(f(x)) \cdot \tau(h(x))$. Then compare $\tau(f(x) \cdot h(x))$
and $\tau(f(x)) \cdot \tau(h(x))$.


26. Describe, as simply as possible, the elements of $\mathrm{Ker}(\tau)$ and $\mathrm{Im}(\tau)$.


In exercises 27-34, let $\sigma : \mathbb{Z}[x] \to \mathbb{C}$ be the homomorphism of rings defined as
$\sigma(p(x)) = p(2 + i)$, for all $p(x) \in \mathbb{Z}[x]$. In addition, let


$f(x) = 6x - 11$, $g(x) = 3x^2 - 2x + 9$, $h(x) = x^3 - 3x^2 - x + 5$.


27. Compute $\sigma(f(x))$, $\sigma(g(x))$, and $\sigma(h(x))$.


28. Compute $f(x) + g(x)$, $\sigma(f(x) + g(x))$, and $\sigma(f(x)) + \sigma(g(x))$. Then compare
$\sigma(f(x) + g(x))$ and $\sigma(f(x)) + \sigma(g(x))$.


29. Compute $g(x) + h(x)$, $\sigma(g(x) + h(x))$, and $\sigma(g(x)) + \sigma(h(x))$. Then compare
$\sigma(g(x) + h(x))$ and $\sigma(g(x)) + \sigma(h(x))$.


30. Compute $f(x) + h(x)$, $\sigma(f(x) + h(x))$, and $\sigma(f(x)) + \sigma(h(x))$. Then compare
$\sigma(f(x) + h(x))$ and $\sigma(f(x)) + \sigma(h(x))$.


31. Compute $f(x) \cdot g(x)$, $\sigma(f(x) \cdot g(x))$, and $\sigma(f(x)) \cdot \sigma(g(x))$. Then compare $\sigma(f(x) \cdot g(x))$
and $\sigma(f(x)) \cdot \sigma(g(x))$.


32. Compute $g(x) \cdot h(x)$, $\sigma(g(x) \cdot h(x))$, and $\sigma(g(x)) \cdot \sigma(h(x))$. Then compare $\sigma(g(x) \cdot h(x))$
and $\sigma(g(x)) \cdot \sigma(h(x))$.


33. Compute $f(x) \cdot h(x)$, $\sigma(f(x) \cdot h(x))$, and $\sigma(f(x)) \cdot \sigma(h(x))$. Then compare $\sigma(f(x) \cdot h(x))$
and $\sigma(f(x)) \cdot \sigma(h(x))$.


34. Describe, as simply as possible, the elements of $\mathrm{Ker}(\sigma)$ and $\mathrm{Im}(\sigma)$.


35. Let $f : R \to S$ and $g : S \to T$ be homomorphisms of rings. Show that the composition
$g \circ f : R \to T$ is also a homomorphism.


36. If F is a field and $F[x]$ is the set of polynomials with coefficients in F, let
$\sigma : F[x] \to F[x]$ and $\tau : F[x] \to F[x]$ be defined as $\sigma(p(x)) = p(x + 1)$ and
$\tau(p(x)) = p(x - 1)$, for all $p(x) \in F[x]$.

(a) Show that both $\sigma$ and $\tau$ are homomorphisms of rings.


(b) Show that the compositions $\sigma \circ \tau$ and $\tau \circ \sigma$ are both the identity map on $F[x]$.


(c) Show that both $\sigma$ and $\tau$ are automorphisms of $F[x]$.


(d) Show that $f(x) \in F[x]$ is reducible in $F[x]$ if and only if $f(x + 1)$ is reducible in
$F[x]$.


37. Let R be an integral domain with only a finite number of elements. 
(a) If $a \in R$ and $a \neq 0$, show that the function $\tau : R \to R$ defined as $\tau(r) = ar$, for all
$r \in R$, is an injection.


(b) Show that if $a \in R$ and $a \neq 0$, then a has a multiplicative inverse in R.
(c) Prove that R must be a field. 


The remaining exercises in this section will deal with various properties of rings. In these 
exercises, unless stated otherwise, we will not be assuming that multiplication is commutative. 
To avoid confusion, when dealing with rings R, S, we will let $0_R$, $0_S$ denote, respectively, the
additive identities of R and S. Similarly, we will let $1_R$, $1_S$ denote, respectively, the
multiplicative identities of R and S.


38. Let r be an element of an integral domain R such that $r^2 = r$. Show that either $r = 0_R$ or
$r = 1_R$.


39. Let $\tau : R \to S$ be a homomorphism of rings. If S is an integral domain, show that either
$\tau(1_R) = 0_S$ or $\tau(1_R) = 1_S$.


40. Let r be an element of an integral domain R such that $r^3 = r$. Show that either $r = 0_R$,
$r = 1_R$, or $r = -1_R$, where $-1_R$ is the additive inverse of $1_R$.


41. Let $\tau : R \to S$ be a homomorphism of rings. Show that $\tau(0_R) = 0_S$.


42. Let $\tau : R \to S$ be a homomorphism of rings which is also a bijection. Show that R is an
integral domain if and only if S is an integral domain.


43. If $\sigma : R \to S$ is a homomorphism of rings and $a \in \mathrm{Ker}(\sigma)$, show that $ar, ra \in \mathrm{Ker}(\sigma)$, for
all $r \in R$.


44. Let $\tau : R \to S$ be a homomorphism of rings which is also an injection. Show that if S is a
commutative ring then R is also a commutative ring.


45. Let $\sigma : R \to S$ be a homomorphism of rings which is also a surjection. Show that if R is a
commutative ring then S is also a commutative ring.


46. Let $\sigma : \mathbb{R}[x] \to \mathbb{R}[x]$ be the homomorphism of rings where $\sigma(f(x)) = f(2x - 1)$, for all
$f(x) \in \mathbb{R}[x]$.
(a) Find $\sigma(x^2)$.


(b) Find $\sigma(3x^2 - 4x + 1)$.


(c) Find $g(x) \in \mathbb{R}[x]$ such that $\sigma(g(x)) = x$.
(d) Explain why $\sigma^{-1}$ exists and then find the formula for $\sigma^{-1}(f(x))$, for all $f(x) \in \mathbb{R}[x]$.


47. Leto: R[x] — R[x] be the homomorphism of rings where o( f(x)) = f(x‘), for all 
f(x) € R[x]. 


(a) Is $\sigma$ injective? Explain your answer.


(b) Is $\sigma$ surjective? Explain your answer.
48. Let $\tau : \mathbb{Z} \to \mathbb{Z}$ be a homomorphism of rings.
(a) Show that either $\tau(1) = 0$ or $\tau(1) = 1$.
(b) Conclude that either $\tau(n) = 0$, for all $n \in \mathbb{Z}$, or $\tau$ is the identity map.


For exercises 49-50, please read the following: 


A subset I of a ring R is called an ideal if I is a subgroup of R under addition and whenever
$a \in I$ and $r \in R$, we have $ar, ra \in I$. Observe that exercise 43, together with the fact that the kernel
of every group homomorphism is a subgroup, shows that the kernel of every ring
homomorphism is an ideal.


49. If $n \in \mathbb{N}$, let $n\mathbb{Z} = \{n \cdot t \mid t \in \mathbb{Z}\}$. Show that the set $n\mathbb{Z}$ is an ideal of $\mathbb{Z}$.

50. Suppose I is an ideal of the ring $\mathbb{Z}$ such that I consists of more than the single element 0.
(a) Show that I must contain a positive integer.
(b) Show that I contains a smallest positive integer.


(c) If n is the smallest positive integer in I, show that any $a \in I$ must be a multiple of n.
(Hint: Apply the division algorithm and divide a by n and then examine the
remainder.)


(d) Conclude that I is equal to the set $n\mathbb{Z} = \{n \cdot t \mid t \in \mathbb{Z}\}$.
(e) Use this exercise and the previous exercise to describe all ideals of the ring $\mathbb{Z}$.


51. Let I be an ideal of a ring R. Show that $I = R$ if and only if $1_R \in I$.


52. Generalize exercise 49 and show that if R is a commutative ring and $a \in R$ then the set
$aR = \{a \cdot t \mid t \in R\}$ is an ideal of R.


53. Let R be a commutative ring. Show that R is a field if and only if the sets $\{0_R\}$ and R are
the only ideals of R.


54. Let $\sigma : \mathbb{Z}_p[x] \to \mathbb{Z}_p[x]$ be defined as $\sigma(f(x)) = f(x)^p$, for all $f(x) \in \mathbb{Z}_p[x]$.
(a) Show that $\sigma$ is a homomorphism of rings.
(b) Contrast $\sigma$ to the function $\tau : \mathbb{Z}_p[x] \to \mathbb{Z}_p[x]$ defined as $\tau(f(x)) = f(x^p)$, for all
$f(x) \in \mathbb{Z}_p[x]$.




9.2 Rational Root Test and Irreducible Polynomials 


We now have the tools needed to examine the roots and factoring of polynomials in Z[x] and 
Q[x]. First, we will deal with the question of whether there is an algorithm for finding all the 
rational roots of a polynomial with rational coefficients. We will show that there is an 
algorithm such that for any $f(x) \in \mathbb{Q}[x]$, we can write down a short list of rational numbers
that are candidates for being roots of f(x). Usually, not all of our candidates will be roots of 
f(x). In fact, sometimes none of our candidates will be roots of f(x). However, the important 
property of our list of candidates will be that it is guaranteed to contain all the rational roots of 
f(x). Therefore, if we simply plug each of our candidates into f(x) and see which ones give 0 
as an answer, we will have succeeded in finding all the rational roots of f(x). 


Before stating how we come up with our list of candidates, we need to make an observation 
that will simplify the problem. If $f(x) \in \mathbb{Q}[x]$, then obviously all the coefficients of f(x) are
rational numbers. Therefore, there exists a positive integer M such that the polynomial
$M \cdot f(x)$ has only integer coefficients. Furthermore, $M \cdot f(x)$ has the same roots as f(x). Thus,
whenever we are looking for the roots of a polynomial in Q[x], we can always simplify the
problem by looking at an appropriate polynomial in Z[x].


For example, if we want to find the roots of

$$f(x) = \frac{1}{6}x^3 - \frac{1}{3}x^2 - \frac{1}{3}x - \frac{1}{2},$$

we can multiply f(x) by 6 to obtain the polynomial

$$F(x) = x^3 - 2x^2 - 2x - 3.$$


To find the roots of f(x), it suffices to find the roots of F(x). In light of this, we can state our 
result in terms of polynomials in Z[x]. 


Theorem 9.6—The Rational Root Test. Let 
$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \in \mathbb{Z}[x],$$

where $a_n, a_0 \neq 0$. If a rational number is a root of f(x) and is written in lowest terms as $\frac{r}{s}$,
then $r \mid a_0$ and $s \mid a_n$.


Before proving Theorem 9.6, we will look at several examples. 


Examples 


1. Let us return to the polynomial $F(x) = x^3 - 2x^2 - 2x - 3$. The Rational Root Test
asserts that if $\frac{r}{s}$ is a rational root of F(x) in lowest terms, then $r \mid -3$ and $s \mid 1$.
Therefore, the only possibilities for r are ±1, ±3, and the only possibilities for s are
±1. As a result, the only possibilities for $\frac{r}{s}$ are ±1, ±3. Having produced a list of four
candidates, we now plug each candidate into F(x) and observe that $F(1) = -6$,
$F(-1) = -4$, $F(3) = 0$, and $F(-3) = -42$. Therefore, 3 is the only rational root of
$F(x) = x^3 - 2x^2 - 2x - 3$.


2. Let $g(x) = \frac{3}{2}x^3 - \frac{7}{5}$; in order to apply the Rational Root Test, we multiply g(x) by 10
and turn our attention to the polynomial

$$G(x) = 10g(x) = 15x^3 - 14.$$

The test now asserts that if $\frac{r}{s}$ is a rational root of G(x) in lowest terms, then $r \mid -14$
and $s \mid 15$. Now the only possibilities for r are ±1, ±2, ±7, ±14 and the only
possibilities for s are ±1, ±3, ±5, ±15. As a result, the only possibilities for $\frac{r}{s}$ are

$$\pm 1,\ \pm 2,\ \pm 7,\ \pm 14,\ \pm\frac{1}{3},\ \pm\frac{2}{3},\ \pm\frac{7}{3},\ \pm\frac{14}{3},\ \pm\frac{1}{5},\ \pm\frac{2}{5},\ \pm\frac{7}{5},\ \pm\frac{14}{5},\ \pm\frac{1}{15},\ \pm\frac{2}{15},\ \pm\frac{7}{15},\ \pm\frac{14}{15}.$$

If G(x) has any rational roots, they must be on our list of 32 candidates. But none of
these candidates give 0 as an answer when plugged into G(x). Therefore, we can
conclude that G(x) has no rational roots.


Note that the Intermediate Value Theorem tells us that G(x) does indeed have a real
root. In fact, it is not hard to see that $\left(\frac{14}{15}\right)^{1/3}$ is the real root of this polynomial.
Therefore, the Rational Root Test also tells us that $\left(\frac{14}{15}\right)^{1/3}$ is irrational.


3. In light of the previous example, it is not surprising that the Rational Root Test can
be used to show that many real numbers are irrational. For example, if we look at
$x^2 - 2$, then the test says that the only possible rational roots of $x^2 - 2$ are ±1, ±2.
Since none of these four candidates are roots of $x^2 - 2$, we know that $x^2 - 2$ has no
rational roots. But since $\sqrt{2}$ is a root of $x^2 - 2$, we have yet another proof that $\sqrt{2}$ is
irrational. Similarly, the Rational Root Test says that the only possible rational roots
of $x^5 - 3$ are ±1, ±3. But since none of these four candidates are roots of $x^5 - 3$, and
since $3^{1/5}$ is a root of $x^5 - 3$, we see that $3^{1/5}$ is also irrational.


4. Let $h(x) = 2x^4 - 7x^3 - 18x^2 - 20x - 11$. The Rational Root Test tells us that if $\frac{r}{s}$ is a
rational root in lowest terms, then $r \mid -11$ and $s \mid 2$. Therefore, the only possible
values of r are ±1, ±11, and the only possible values of s are ±1, ±2. As a result,
our list of candidates for $\frac{r}{s}$ is $\pm 1, \pm 11, \pm\frac{1}{2}, \pm\frac{11}{2}$. If we plug each of these eight
candidates into h(x), we will find that $-1$ and $\frac{11}{2}$ are the only rational roots of
$2x^4 - 7x^3 - 18x^2 - 20x - 11$.


5. Let $w(t) = t^5 + 3t^4 - 4t^3 - 12t^2$; the Rational Root Test discusses polynomials with
nonzero constant term. However, if the constant term is zero, we can factor out
as many powers of t as necessary to obtain a new polynomial where the constant
term is not zero. For example, $w(t) = t^2(t^3 + 3t^2 - 4t - 12)$. Applying the test to
$t^3 + 3t^2 - 4t - 12$, the only possible rational roots of $t^3 + 3t^2 - 4t - 12$ are
±1, ±2, ±3, ±4, ±6, ±12. If we check each of these 12 candidates, we will see that
$-3, -2, 2$ are the only rational roots of $t^3 + 3t^2 - 4t - 12$. Since t is also a factor of
w(t), we know that 0 is also a root of w(t). Therefore, $-3, -2, 0, 2$ are all the rational
roots of w(t).
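The procedure used in these examples (list the candidates, then test each one) can be sketched in a few lines of code. The sketch below is illustrative and is not an algorithm stated in this form in the text: the function names are hypothetical, and polynomials are given as lists of integer coefficients with the highest degree first. Exact arithmetic with `fractions.Fraction` avoids rounding errors when testing candidates.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients listed highest degree first) via Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def rational_root_candidates(coeffs):
    """All fractions +-r/s with r dividing the constant term and s dividing the leading coefficient."""
    leading, constant = coeffs[0], coeffs[-1]
    if leading == 0 or constant == 0:
        raise ValueError("leading and constant coefficients must be nonzero")
    candidates = set()
    for r in divisors(constant):
        for s in divisors(leading):
            candidates.add(Fraction(r, s))
            candidates.add(Fraction(-r, s))
    return candidates


def rational_roots(coeffs):
    """Keep only the candidates that really are roots."""
    return sorted(c for c in rational_root_candidates(coeffs) if evaluate(coeffs, c) == 0)


print(rational_roots([1, -2, -2, -3]))            # F(x) = x^3 - 2x^2 - 2x - 3
print(len(rational_root_candidates([15, 0, 0, -14])))   # number of candidates for G(x) = 15x^3 - 14
print(rational_roots([2, -7, -18, -20, -11]))     # h(x) = 2x^4 - 7x^3 - 18x^2 - 20x - 11
```

As in Example 5, a polynomial with zero constant term should first have its powers of the variable factored out; the sketch raises an error in that case instead of doing so automatically.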


Proof of Theorem 9.6. Suppose $\frac{r}{s}$ is a rational root of f(x), which is in lowest terms. Plugging
$\frac{r}{s}$ into f(x) yields

$$0 = a_n \left(\frac{r}{s}\right)^n + a_{n-1} \left(\frac{r}{s}\right)^{n-1} + \cdots + a_1 \left(\frac{r}{s}\right) + a_0.$$

Multiplying this equation by $s^n$ gives us

(2) $0 = a_n r^n + a_{n-1} r^{n-1} s + \cdots + a_1 r s^{n-1} + a_0 s^n.$


At various points in the proof, it might be helpful to assume that $r > 0$ or that $s > 0$. Since
$\frac{r}{s} = \frac{-r}{-s}$, this will not be a problem. Furthermore, if r or s is equal to 1, then it is certainly a
divisor of every integer. Therefore, when attempting to show that $r \mid a_0$, we may assume that
$r \geq 2$. Similarly, when attempting to show that $s \mid a_n$, we may assume that $s \geq 2$.


Since we may assume that $r \geq 2$, let $\sigma_r$ be the homomorphism of rings from $\mathbb{Z}$ to $\mathbb{Z}_r$ where
$\sigma_r(a) = [a]_r$, for all $a \in \mathbb{Z}$. Since r and s are relatively prime, Theorem 7.8 tells us that $[s]_r$ is
invertible in $\mathbb{Z}_r$. Also remember that if $b \in \mathbb{Z}$, then $[b]_r = [0]_r$ if and only if b is a multiple of r.
Now, if we plug both sides of (2) into the function $\sigma_r$ and then use Lemma 9.5, we obtain

$$\begin{aligned}
[0]_r = \sigma_r(0) &= \sigma_r(a_n r^n + a_{n-1} r^{n-1} s + \cdots + a_1 r s^{n-1} + a_0 s^n) \\
&= \sigma_r(a_n r^n) + \sigma_r(a_{n-1} r^{n-1} s) + \cdots + \sigma_r(a_1 r s^{n-1}) + \sigma_r(a_0 s^n) \\
&= [0]_r + [0]_r + \cdots + [0]_r + [a_0]_r [s]_r^n.
\end{aligned}$$

Thus,

$$[a_0]_r [s]_r^n = [0]_r$$

in $\mathbb{Z}_r$. However, since $[s]_r$ is invertible in $\mathbb{Z}_r$, Proposition 7.7 tells us that $[s]_r^n$ is also invertible
in $\mathbb{Z}_r$. Multiplying our last equation by the inverse of $[s]_r^n$ now tells us that $[a_0]_r = [0]_r$. As a
result, $a_0$ is a multiple of r, so $r \mid a_0$, as desired.


The proof that $s \mid a_n$ is very similar to the preceding argument. In this case, we may assume that
$s \geq 2$, and we then let $\sigma_s : \mathbb{Z} \to \mathbb{Z}_s$ be the homomorphism where $\sigma_s(a) = [a]_s$, for all $a \in \mathbb{Z}$.
Analogous to the computation in the preceding paragraph, if we apply $\sigma_s$ to both sides of (2),
we obtain

$$[a_n]_s \cdot [r]_s^n = [0]_s.$$

Since r and s are relatively prime, $[r]_s$ is invertible in $\mathbb{Z}_s$. Therefore, $[r]_s^n$ is also invertible in
$\mathbb{Z}_s$, and if we multiply the previous equation by the inverse of $[r]_s^n$, we obtain $[a_n]_s = [0]_s$.
Thus, $a_n$ is a multiple of s and $s \mid a_n$.
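To see the proof at work on concrete numbers, here is a worked instance (added for illustration, using the polynomial $h(x) = 2x^4 - 7x^3 - 18x^2 - 20x - 11$ from Example 4 and its root $\frac{r}{s} = \frac{11}{2}$). Equation (2) and its reductions modulo $r = 11$ and $s = 2$ become:

```latex
\begin{aligned}
0 &= 2\cdot 11^4 - 7\cdot 11^3\cdot 2 - 18\cdot 11^2\cdot 2^2 - 20\cdot 11\cdot 2^3 - 11\cdot 2^4 \\
  &= 29282 - 18634 - 8712 - 1760 - 176, \\[4pt]
\text{in } \mathbb{Z}_{11}:\quad [0]_{11} &= [0]_{11} + [0]_{11} + [0]_{11} + [0]_{11} + [-11]_{11}\,[2]_{11}^{4}
  \;\Longrightarrow\; [-11]_{11} = [0]_{11}, \text{ so } 11 \mid -11, \\[4pt]
\text{in } \mathbb{Z}_{2}:\quad [0]_{2} &= [2]_{2}\,[11]_{2}^{4} + [0]_{2} + [0]_{2} + [0]_{2} + [0]_{2}
  \;\Longrightarrow\; [2]_{2} = [0]_{2}, \text{ so } 2 \mid 2.
\end{aligned}
```

In each reduction every term but one contains the modulus as a factor, and the surviving power ($[2]_{11}^4$ or $[11]_2^4$) is invertible because 11 and 2 are relatively prime.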


If we apply the Rational Root Test to a polynomial with leading coefficient 1 and nonzero
constant term, the test says that any rational root must be of the form $\frac{r}{s}$, where s is a divisor of
1 and r divides the constant term. But this means that s must be ±1, which immediately tells
us that $\frac{r}{s}$ is an integer that divides the constant term. We can use this observation to prove


Corollary 9.7. If $f(x) \in \mathbb{Z}[x]$ has leading coefficient 1, then any nonzero rational root of f(x)
must be an integer which divides the trailing coefficient.


Proof. By factoring out the largest power of x that is a common factor of all the terms of f(x),
we can write $f(x) = x^m g(x)$, where $m \geq 0$ and the constant term of g(x) is nonzero. Therefore,
the roots of g(x) are precisely the nonzero roots of f(x). Our previous observation says that
any rational root of g(x) must be an integer that divides the trailing coefficient of g(x).
However, f(x) and g(x) have the same leading and trailing coefficients. Thus, any nonzero
rational root of f(x) must be an integer that divides the trailing coefficient of f(x).


m Examples 


1. Let f(x) = x® + 17x° +53x4 + 13x3; then Corollary 9.7 says that the only possible 
nonzero rational roots of f(x) are divisors of 13. As a result, +1, +13 are the only 
four candidates for nonzero rational roots. However, f(1) = 84, f(—1) = 24, 
f(U13) = 12,681,084, and f(—13) = 0. Thus, 0 and —13 are the only rational roots 


of f(x). 


2. Let $h(x) = x^5 + 9x^3 + 2$; Corollary 9.7 asserts that the only possible rational roots
of h(x) are divisors of 2. Therefore, ±1, ±2 are the only candidates. However,
h(1) = 12, h(−1) = −8, h(2) = 106, and h(−2) = −102. As a result, h(x) has no
rational roots. On the other hand, the Intermediate Value Theorem tells us that h(x)
has a real root between −1 and 0. Furthermore, since the derivative of h(x) is
$5x^4 + 27x^2$, which is never negative, we know that h(x) is always increasing. Thus,
h(x) cannot have more than one real root. In summary, we know that h(x) has exactly
one real root, and this root is an irrational number that lies between −1 and 0.
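
The search for rational roots carried out in these two examples can be sketched in code. The following is a minimal illustration, not part of the text's development: the function names `divisors`, `evaluate`, and `rational_roots` are our own, and a polynomial is assumed to be given as a list of integer coefficients from the leading term down.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Horner evaluation; coeffs are listed from the leading term down."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

def rational_roots(coeffs):
    """All rational roots of an integer polynomial with nonzero leading coefficient."""
    coeffs = list(coeffs)
    roots = set()
    # Each trailing zero coefficient is a factor of x, which contributes the root 0.
    while len(coeffs) > 1 and coeffs[-1] == 0:
        roots.add(Fraction(0))
        coeffs.pop()
    if len(coeffs) > 1:
        for r in divisors(coeffs[-1]):       # numerator divides the constant term
            for s in divisors(coeffs[0]):    # denominator divides the leading coefficient
                for candidate in (Fraction(r, s), Fraction(-r, s)):
                    if evaluate(coeffs, candidate) == 0:
                        roots.add(candidate)
    return sorted(roots)

print(rational_roots([1, 17, 53, 13, 0, 0, 0]))  # x^6 + 17x^5 + 53x^4 + 13x^3
print(rational_roots([1, 0, 9, 0, 0, 2]))        # x^5 + 9x^3 + 2
```

Exact `Fraction` arithmetic is used so that a candidate is accepted only when the polynomial evaluates to exactly zero; floating-point evaluation could misjudge a near-zero value.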


We can now begin to address the question of which polynomials in Z[x] and Q[x] can be 
written as a product of polynomials of smaller degree. To do so, we need to introduce some 
terminology that will then be followed by a large number of examples. 


Definition 9.8. Let f(x) be a polynomial of degree at least one with coefficients in an integral
domain R. We say that f(x) is reducible in R[x] if there exist g(x), h(x) ∈ R[x], both with
smaller degree than f(x), such that f(x) = g(x)·h(x). If f(x) ∈ R[x] has degree at least one
and is not reducible in R[x], then we say that f(x) is irreducible in R[x].


It is important to be aware that whether f(x) ∈ R[x] is reducible or irreducible depends not
only on f(x) but also on R. This will be illustrated in the following examples. In these
examples we will make frequent use of the fact, which we will prove in Chapter 12, that if F is
a field and if α ∈ F is a root of f(x) ∈ F[x], then there is some g(x) ∈ F[x] such that
f(x) = (x − α)·g(x). Observe that if, in addition, f(x) has degree at least 2, then both x − α and
g(x) have smaller degree than f(x). As a result, we now know that if F is a field, then any
f(x) ∈ F[x] of degree at least 2 with a root in F must be reducible in F[x].


1. Polynomials of Degree 1 with Coefficients in a Field 
All polynomials of degree 1 are irreducible. To see this, if we could factor
f(x) = g(x)·h(x), where f(x) has degree 1, then

1 = deg(f(x)) = deg(g(x)·h(x)) = deg(g(x)) + deg(h(x)).


Therefore, either g(x) or h(x) would have degree 1. But this means it is impossible to 
write f(x) as a product of two polynomials, both of which have smaller degree than f(x). 
Thus, f(x) is irreducible. 


2. Polynomials of Degree 2 with Coefficients in a Field 
We already know that if f(x) ∈ F[x] is of degree 2 and has a root in F, then f(x) is
reducible in F[x]. Conversely, if f(x) is reducible in F[x], there exist g(x), h(x) ∈ F[x],
both of which have smaller degree than f(x), such that f(x) = g(x)·h(x). Since
deg(f(x)) = 2, it follows that both g(x) and h(x) must have degree 1. As a result,
g(x) = ax + b, where a, b ∈ F and a ≠ 0. However, since F is a field, this tells us that
$-ba^{-1}$ belongs to F and is a root of g(x). Therefore,

$f(-ba^{-1}) = g(-ba^{-1}) \cdot h(-ba^{-1}) = 0 \cdot h(-ba^{-1}) = 0.$

Hence, f(x) does have a root in F. As a result, if f(x) ∈ F[x] has degree 2, then f(x) is
irreducible in F[x] if and only if it has no roots in F.




Now let us look at several examples of polynomials of degree 2. 


(a) $f(x) = x^2 - 7x - 18$; this polynomial is reducible in Z[x] as f(x) = (x − 9)(x + 2).
Since Q, R, and C are all larger than Z, this polynomial is also reducible in Q[x],
R[x], and C[x].

(b) $g(x) = x^2 - 2$; this polynomial is irreducible in Q[x] as g(x) has no roots in Q. Since
Q is larger than Z, g(x) is also irreducible in Z[x]. However, g(x) does have roots in
R. Therefore, g(x) is reducible in R[x] and we have

$g(x) = (x - \sqrt{2})(x + \sqrt{2}).$

Since C is larger than R, g(x) is also reducible in C[x].
If we consider the field $Q(\sqrt{2}) = \{a + b\sqrt{2} \mid a, b \in Q\}$, then g(x) also has roots in
this field. Therefore, g(x) is also reducible in $Q(\sqrt{2})[x]$.

(c) $h(x) = x^2 + 1$; this polynomial is irreducible in R[x] as it has no real roots. Since Z
and Q are smaller than R, h(x) is also irreducible in Z[x] and Q[x]. However, h(x)
has roots in C, so h(x) is reducible in C[x] and we have

$h(x) = (x - i)(x + i).$

(d) $k(x) = 2x^2 - 7x + 1$; this polynomial is irreducible in Q[x], since, by the Rational
Root Test, it has no rational roots. Observe that the only candidates for rational roots
are $\pm 1, \pm\frac{1}{2}$, and none of these make k(x) equal to zero. Since Z is smaller than Q,
k(x) is also irreducible in Z[x].

However, k(0) = 1 and k(1) = −4, so the Intermediate Value Theorem tells us that
k(x) has a real root between 0 and 1. Therefore, k(x) is reducible in R[x]. Since C is
larger than R, we also know that k(x) is reducible in C[x].

(e) $v(x) = x^2 + [2]_5 x + [2]_5$; this polynomial is reducible in $Z_5[x]$, since it has roots in $Z_5$.
You should check that $[1]_5$ and $[2]_5$ are indeed roots of v(x). Therefore, we now have

$v(x) = x^2 + [2]_5 x + [2]_5 = (x - [1]_5)(x - [2]_5) = (x + [4]_5)(x + [3]_5).$

(f) $w(x) = x^2 + x + [1]_2$; this polynomial is irreducible in $Z_2[x]$, since it has no roots in
$Z_2$. To see this, observe that

$w([0]_2) = [0]_2^2 + [0]_2 + [1]_2 = [1]_2$ and $w([1]_2) = [1]_2^2 + [1]_2 + [1]_2 = [3]_2 = [1]_2.$


3. Polynomials with Coefficients in R or C 
By the Fundamental Theorem of Algebra, every polynomial of degree at least 1 in C[x] 
has a root in C. Since polynomials of degree at least 2 with a root in a field are reducible, it 
follows that the only irreducible polynomials in C[x] are those of degree 1. 




Moving to R[x], we know that whereas there are no polynomials of degree 2 that are 
irreducible in C[x], there are certainly polynomials of degree 2 that are irreducible 
in R[x]. In fact, the quadratic formula tells us that if 


$f(x) = ax^2 + bx + c \in R[x],$

then f(x) has a root in R if and only if $b^2 - 4ac \ge 0$. Therefore, f(x) is reducible in R[x]
if and only if $b^2 - 4ac \ge 0$. As a result, it is easy to determine which polynomials of
degree 2 are reducible in R[x].


Next, we turn our attention to polynomials of degree 3 or more. Recall that the 
Intermediate Value Theorem told us that any polynomial of odd degree in R[x] has a root 
in R. Therefore, every polynomial in R[x] whose degree is odd and exceeds 1 is reducible 
in R[x]. 


As a result, we now need to examine polynomials in R[x] whose degrees are even and
exceed 2. Consider the polynomial

$f(x) = x^4 + 4x^2 + 3.$

Note that f(x) has no roots in R, but we can factor it in R[x] as

$f(x) = x^4 + 4x^2 + 3 = (x^2 + 1)(x^2 + 3).$

Observe that the roots of f(x) in C are $\pm i, \pm i\sqrt{3}$. Therefore, there certainly exist
polynomials of degree at least 4 with no roots in R that are reducible in R[x].


More generally, let f(x) ∈ R[x] have degree at least 4. If f(x) has a real root, then it is
certainly reducible in R[x]. So let us consider the case where f(x) has no real roots. The
Fundamental Theorem of Algebra tells us that there is some α ∈ C such that α is a root of
f(x). Since f(x) has no real roots, α ∉ R. Therefore, α*, the complex conjugate of α, is not
equal to α. Next, consider the polynomial

$g(x) = (x - \alpha)(x - \alpha^*) = x^2 - (\alpha + \alpha^*)x + \alpha \cdot \alpha^*.$

By Lemma 5.10(b), α + α* and α·α* belong to R, so g(x) ∈ R[x]. In Chapter 12, we will
show that we can write f(x) = g(x)·h(x), where h(x) ∈ R[x]. Using this fact, observe that
since deg(f(x)) ≥ 4, h(x) must have degree at least 2. Therefore, f(x) has been
written as a product of two polynomials in R[x] of smaller degree. Thus, f(x) is reducible
in R[x]. As a result, every polynomial of degree at least 3 in R[x] is reducible in R[x].
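
To make the conjugate-pair construction concrete, here is a small numerical sketch. It is only an illustration: the helper names `real_quadratic_from_root` and `divide_by_monic` are our own, floating-point arithmetic is used for simplicity, and a non-real root is assumed to be already known. The sketch builds the real quadratic with roots α and α*, then divides it into a real polynomial.

```python
def real_quadratic_from_root(alpha):
    """Coefficients [1, -(a + a*), a·a*] of (x - a)(x - a*); both are real numbers."""
    return [1.0, -2.0 * alpha.real, abs(alpha) ** 2]

def divide_by_monic(f, g):
    """Long division of f by a monic g (coefficient lists, leading term first).
    Returns (quotient, remainder)."""
    f = [float(c) for c in f]
    quotient = []
    for i in range(len(f) - len(g) + 1):
        q = f[i]                      # g is monic, so the next quotient coefficient is f[i]
        quotient.append(q)
        for j, gc in enumerate(g):
            f[i + j] -= q * gc
    remainder = f[len(f) - len(g) + 1:]
    return quotient, remainder

f = [1, 0, 4, 0, 3]                   # x^4 + 4x^2 + 3, which has no real roots
g = real_quadratic_from_root(1j)      # alpha = i gives x^2 + 1
h, rem = divide_by_monic(f, g)
print(g, h, rem)                      # the remainder is zero, so f(x) = g(x)·h(x)
```

Here the quotient is $x^2 + 3$, which matches the factorization $(x^2 + 1)(x^2 + 3)$ displayed earlier.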


4. Polynomials of Degree 3 with Coefficients in a Field

The situation for polynomials of degree 3 is virtually the same as for those of degree 2. We
know that if f(x) ∈ F[x] is of degree 3 and has a root in F, then f(x) is reducible in F[x].
Conversely, if f(x) = g(x)·h(x), where g(x), h(x) both have smaller degree than f(x),
then either g(x) or h(x) must have degree 1. Therefore, either g(x) or h(x) must have a
root in F, which immediately tells us that f(x) has a root in F. As a result, if f(x) ∈ F[x]
has degree 3, then f(x) is irreducible in F[x] if and only if it has no roots in F.


Given a polynomial in Q[x], we can use the Rational Root Test to determine if it has any
rational roots. Therefore, in Q[x], we now have a straightforward procedure to determine
when polynomials of degree 3 are reducible. We now consider several examples.
(a) $f(x) = 5x^3 + 3x^2 - 35x - 21$; by the Rational Root Test, the 16 candidates for
rational roots of f(x) are $\pm 1, \pm 3, \pm 7, \pm 21, \pm\frac{1}{5}, \pm\frac{3}{5}, \pm\frac{7}{5}, \pm\frac{21}{5}$. You can check that
$-\frac{3}{5}$ is a root of f(x), so f(x) is reducible in Q[x]. In fact,

$f(x) = 5x^3 + 3x^2 - 35x - 21 = (5x + 3)(x^2 - 7)$

in Q[x].

(b) $g(x) = 7x^3 - 8x^2 + 5x + 2$; by the Rational Root Test, the eight candidates for rational
roots of g(x) are $\pm 1, \pm 2, \pm\frac{1}{7}, \pm\frac{2}{7}$. None of these candidates makes g(x) equal to zero,
so g(x) is irreducible in Q[x].

(c) $h(x) = x^3 + x + [1]_2$; this polynomial is irreducible in $Z_2[x]$, since it has no roots in
$Z_2$. To see that neither $[0]_2$ nor $[1]_2$ is a root, observe that

$h([0]_2) = [0]_2^3 + [0]_2 + [1]_2 = [1]_2$ and $h([1]_2) = [1]_2^3 + [1]_2 + [1]_2 = [3]_2 = [1]_2.$

(d) $j(x) = [2]_7 x^3 + x + [1]_7$; this polynomial has the root $[4]_7 \in Z_7$. Observe that

$j([4]_7) = [2]_7 \cdot [4]_7^3 + [4]_7 + [1]_7 = [128]_7 + [4]_7 + [1]_7 = [2]_7 + [4]_7 + [1]_7 = [7]_7 = [0]_7.$

Therefore, $x - [4]_7 = x + [3]_7$ is a factor of j(x), and we have

$j(x) = [2]_7 x^3 + x + [1]_7 = (x + [3]_7)([2]_7 x^2 + x + [5]_7)$

in $Z_7[x]$.
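
Since $Z_p$ is finite, the root test for polynomials of degree 2 or 3 over $Z_p$ can be carried out by simply trying every element. The sketch below is one way to do this; the names `roots_mod_p` and `irreducible_low_degree` are our own, and coefficients are assumed to be given as ordinary integers standing for their classes mod p, leading term first.

```python
def roots_mod_p(coeffs, p):
    """Roots in Z_p of a polynomial whose integer coefficients (leading term
    first) stand for their classes mod p."""
    found = []
    for a in range(p):
        value = 0
        for c in coeffs:              # Horner evaluation, reduced mod p at each step
            value = (value * a + c) % p
        if value == 0:
            found.append(a)
    return found

def irreducible_low_degree(coeffs, p):
    """For degree 2 or 3 over Z_p: irreducible exactly when there is no root in Z_p."""
    if coeffs[0] % p == 0 or len(coeffs) not in (3, 4):
        raise ValueError("need degree 2 or 3 with leading coefficient nonzero mod p")
    return not roots_mod_p(coeffs, p)

print(roots_mod_p([1, 2, 2], 5))                 # x^2 + [2]x + [2] over Z_5
print(irreducible_low_degree([1, 1, 1], 2))      # x^2 + x + [1] over Z_2
print(irreducible_low_degree([1, 0, 1, 1], 2))   # x^3 + x + [1] over Z_2
print(roots_mod_p([2, 0, 1, 1], 7))              # [2]x^3 + x + [1] over Z_7
```

The degree restriction matters: as the next group of examples shows, a polynomial of degree 4 or more can be reducible without having any roots.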


5. Polynomials of Degree at Least 4 with Coefficients in a Field 

The situation for polynomials of degree 4 and above is different from that of
degree 2 or 3. Certainly, if f(x) ∈ F[x] has a root in F, then f(x) is reducible in F[x].
However, a polynomial of degree 4 or more can fail to have a root in F yet still be
reducible in F[x].

(a) $f(x) = x^4 + x^3 + 6x^2 + 5x + 5$; the Rational Root Test shows that this polynomial has
no roots in Q. The only candidates are ±1 and ±5, but f(1) = 18, f(−1) = 6, f(5) = 930,
and f(−5) = 630. On the other hand, f(x) is reducible in Q[x] as

$f(x) = x^4 + x^3 + 6x^2 + 5x + 5 = (x^2 + 5)(x^2 + x + 1).$

Note that neither of the factors of f(x) can have a root in Q, since any root of a factor
of f(x) would automatically be a root of f(x).


(b) $g(x) = x^4 + x^2 + [1]_2$; this polynomial has no roots in $Z_2$ as

$g([0]_2) = [0]_2^4 + [0]_2^2 + [1]_2 = [1]_2$ and $g([1]_2) = [1]_2^4 + [1]_2^2 + [1]_2 = [3]_2 = [1]_2.$

However, g(x) is reducible in $Z_2[x]$ as

$g(x) = x^4 + x^2 + [1]_2 = x^4 + [2]_2 x^3 + [3]_2 x^2 + [2]_2 x + [1]_2 = (x^2 + x + [1]_2)^2.$
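
Factorizations of this kind are easy to double-check by multiplying the factors back together. The following sketch uses a helper `poly_mul` of our own naming, with coefficient lists given leading term first and an optional modulus p, to verify the factorization of g(x) in $Z_2[x]$ along with the earlier factorization of $x^4 + 4x^2 + 3$ in Z[x].

```python
def poly_mul(f, g, p=None):
    """Product of two coefficient lists (leading term first); reduced mod p if p is given."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    if p is not None:
        product = [c % p for c in product]
    return product

# (x^2 + x + 1)^2 in Z_2[x] is x^4 + x^2 + 1
print(poly_mul([1, 1, 1], [1, 1, 1], p=2))   # [1, 0, 1, 0, 1]

# (x^2 + 1)(x^2 + 3) in Z[x] is x^4 + 4x^2 + 3
print(poly_mul([1, 0, 1], [1, 0, 3]))        # [1, 0, 4, 0, 3]
```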


Before continuing on, we should mention an easy but important point that is worth 
keeping in mind. Suppose E, F, L are fields with E ⊆ F ⊆ L. If f(x) ∈ F[x] is reducible
in F[x], then f(x) is certainly reducible in L[x]. Similarly, if f(x) ∈ F[x] is irreducible in
F[x] and if f(x) ∈ E[x], then f(x) is also irreducible in E[x].


As indicated in our examples, there is no difficulty in determining when polynomials in 
R[x] and C[x] are reducible. In addition, we have a straightforward procedure for 
determining when polynomials of degree at most 3 in Q[x] are reducible. However, in 
Q[x] the situation for polynomials of degree at least 4 can become quite difficult. 
Although the Rational Root Test provides us with an algorithm for finding rational roots, 
it cannot, by itself, be used to determine whether a polynomial of degree at least 4 is 
reducible. 


Exercises for Section 9.2 


In exercises 1—14, find all rational roots of the given polynomial. 


1. 


S 


SOP OP SN Ow Re TE 


2x7 +5x*-—x-1 

3x3 — 19x? + 34x — 14 

Ax®— 11 

x= 54 Fe = 10 

x* + 10x? + 35x? + 50x +24 
ea Te 15 

x4 +713 +2x+14 
P+4x+3 

x+ — 8x? —9 


6x7 +x? — 18x +8 




11. $3x^4 - 6x^3 + 5x - 10$

12. $6x^3 - 8x^2 + 5$

13. $x^3 + 2x^2 - 5x - 6$

14. $48x^3 - 74x^2 - 17x + 30$

15. Use the rational root test to prove that 73 is irrational.

16. Use the rational root test to prove that 113 is irrational.

17. Let $g(x) = x^4 - 16x^2 + 4$.
(a) Verify that $\sqrt{3} + \sqrt{5}$ is a root of g(x).
(b) Use the rational root test to show that g(x) has no rational roots.
(c) Use parts (a) and (b) to show that $\sqrt{3} + \sqrt{5}$ is irrational.
(d) Why do you now immediately know that $\sqrt{3} - \sqrt{5}$, $-\sqrt{3} + \sqrt{5}$, and $-\sqrt{3} - \sqrt{5}$ are
also irrational?

18. Let $h(x) = x^4 - 14x^2 + 9$.
(a) Verify that $\sqrt{2} + \sqrt{5}$ is a root of h(x).
(b) Use the rational root test to show that h(x) has no rational roots.
(c) Use parts (a) and (b) to show that $\sqrt{2} + \sqrt{5}$ is irrational.
(d) Why do you now immediately know that $\sqrt{2} - \sqrt{5}$, $-\sqrt{2} + \sqrt{5}$, and $-\sqrt{2} - \sqrt{5}$ are
also irrational?

19. (a) Find a monic polynomial in Z[x] of degree 4 which has $\sqrt{2} + \sqrt{3}$ as a root.
(b) Use the rational root test to show that your answer to part (a) has no rational roots.
(c) Use parts (a) and (b) to show that $\sqrt{2} + \sqrt{3}$ is irrational.

20. (a) Find a monic polynomial in Z[x] of degree 4 that has $2\sqrt{2} - \sqrt{3}$ as a root.
(b) Use the rational root test to show that your answer to part (a) has no rational roots.
(c) Use parts (a) and (b) to show that $2\sqrt{2} - \sqrt{3}$ is irrational.

21. List all monic irreducible polynomials of degree 2 in $Z_2[x]$.

22. List all monic irreducible polynomials of degree 3 in $Z_2[x]$.

23. List all monic irreducible polynomials of degree 4 in $Z_2[x]$.

24. List all monic irreducible polynomials of degree 2 in $Z_3[x]$.


25. List all monic irreducible polynomials of degree 3 in $Z_3[x]$.

26. List all monic irreducible polynomials of degree 2 in $Z_5[x]$.

27. Find all integers A such that $x^2 - Ax + 1$ is reducible in Q[x].

28. Find all integers A such that $x^2 - Ax - 1$ is reducible in Q[x].

29. Find all integers A such that $x^2 - Ax + 2$ is reducible in Q[x].

30. Find all integers A such that $x^2 - Ax - 2$ is reducible in Q[x].

31. If p is a prime number, find all integers A such that $x^2 - Ax + p$ is reducible in Q[x].

32. If p is a prime number, find all integers A such that $x^2 - Ax - p$ is reducible in Q[x].

33. Find all real numbers k such that $x^2 + kx + k$ is reducible in R[x].

34. Find all real numbers k such that $x^2 + kx - 5$ is reducible in R[x].

35. Find all real numbers k such that $x^2 + kx + 5$ is reducible in R[x].

36. Find all real numbers k such that $x^2 + kx + k^2$ is reducible in R[x].

37. Write $x^4 + 1$ as a product of two monic quadratic polynomials in R[x].

38. Write $x^4 + 2$ as a product of two monic quadratic polynomials in R[x].

39. Write x® − 1 as a product of monic irreducible polynomials in R[x].

40. Write x*® − 4 as a product of monic irreducible polynomials in R[x].


In exercises 41-48, find all the roots in C of the given polynomial. 


41. 
42. 
43. 
44. 
45. 
46. 
47. 
48. 


Qx* +3x—7 
2x7 + 3x+7 

2x3 — 9x? —2x— 15 
9s 0 = 55 ye 8 
x4 —4x? —21 

eee ae 

6x? — 5x? —29x+ 10 
10x3 — 2x7 +45x—9 




In exercises 49-54, determine if the given polynomials are reducible in $Q(\sqrt{2})[x]$, where
$Q(\sqrt{2})$ denotes the field $\{a + b\sqrt{2} \mid a, b \in Q\}$ of real numbers.


49. x*-8 
50. x*+1 
51. x7+5x+8 
52. x3°+5 


53. x7—-6x—41 
54. x*-3x+1 


9.3 Gauss’ Lemma and Eisenstein’s Criterion 


When a polynomial f(x) ∈ R[x] is irreducible in R[x], we often use the expression that f(x) is
irreducible over R. Similarly, if f(x) ∈ R[x] is reducible in R[x], we often say that f(x) is
reducible over R. In this chapter, we have seen many examples of polynomials that are
irreducible over one field but are reducible over a larger field. For example, $x^2 - 11$ is
irreducible over Q but is reducible over R. Similarly, $x^2 + 5$ is irreducible over R but is
reducible over C. Since Q is larger than Z, this raises the question as to what happens to the
reducibility of polynomials when we move from Z[x] to Q[x]. In particular, if a polynomial
f(x) ∈ Z[x] is irreducible in Z[x], can it possibly be reducible in Q[x]? This question is
answered in


Theorem 9.9—Gauss’ Lemma. Let f(x) be a polynomial with integer coefficients. If f(x) is 
reducible in Q[x], then f(x) is reducible in Z[x]. More precisely, if f(x) = g(x)·h(x), where
g(x), h(x) ∈ Q[x], then there exist G(x), H(x) ∈ Z[x] such that f(x) = G(x)·H(x),
deg(G(x)) = deg(g(x)), and deg(H(x)) = deg(h(x)).


Intuition. In the proof of Gauss’ Lemma, we will need to show that if we can factor
f(x) as

f(x) = g(x)·h(x)

in Q[x], then we can modify g(x) and h(x) to obtain polynomials G(x) and H(x)
such that

(a) f(x) = G(x)·H(x);
(b) deg(G(x)) = deg(g(x)), deg(H(x)) = deg(h(x)); and
(c) G(x), H(x) ∈ Z[x].


To see how the procedure for modifying g(x) and h(x) works, consider the polynomial

$f(x) = 2x^2 + 13x - 7 \in Z[x].$

Suppose f(x) is factored in Q[x] as

$f(x) = 2x^2 + 13x - 7 = \left(\frac{3}{2}x + \frac{21}{2}\right)\left(\frac{4}{3}x - \frac{2}{3}\right);$

we need to appropriately modify $\frac{3}{2}x + \frac{21}{2}$ and $\frac{4}{3}x - \frac{2}{3}$ to obtain polynomials G(x) and H(x)
with the three properties listed above.

First, since all the coefficients of $\frac{3}{2}x + \frac{21}{2}$ belong to Q, we can multiply this polynomial by a
positive integer such that all the coefficients will become integers. In particular, the number 2
will do the trick in this case. Therefore, if we multiply $\frac{3}{2}x + \frac{21}{2}$ by 2 and divide $\frac{4}{3}x - \frac{2}{3}$ by 2,
we obtain

$f(x) = 2x^2 + 13x - 7 = \left(2\left(\frac{3}{2}x + \frac{21}{2}\right)\right)\left(\frac{1}{2}\left(\frac{4}{3}x - \frac{2}{3}\right)\right) = (3x + 21)\left(\frac{2}{3}x - \frac{1}{3}\right).$

Next, we let c be the greatest common divisor of all the coefficients of 3x + 21. We then divide
3x + 21 by c and multiply the second polynomial $\frac{2}{3}x - \frac{1}{3}$ by c. In this case, c = 3 and we
obtain

$f(x) = 2x^2 + 13x - 7 = (3x + 21)\left(\frac{2}{3}x - \frac{1}{3}\right) = \left(\frac{1}{3}(3x + 21)\right)\left(3\left(\frac{2}{3}x - \frac{1}{3}\right)\right) = (x + 7)(2x - 1).$

Now observe that both x + 7 and 2x − 1 belong to Z[x], so we have appropriately modified
$\frac{3}{2}x + \frac{21}{2}$ and $\frac{4}{3}x - \frac{2}{3}$.

As we look back at this procedure, the steps taken guaranteed that $\frac{3}{2}x + \frac{21}{2}$ would be modified
into a polynomial with integer coefficients such that the greatest common divisor of the
coefficients was 1. But there was no guarantee that this procedure would modify $\frac{4}{3}x - \frac{2}{3}$ into a
polynomial with integer coefficients. Certainly, in this particular case $\frac{4}{3}x - \frac{2}{3}$ was modified
into a polynomial with integer coefficients. However, the real work in proving Gauss’ Lemma
is showing that the modified version of the second polynomial always belongs to Z[x].


Proof. Suppose in Q[x] we can write

f(x) = g(x)·h(x),

where both g(x) and h(x) have smaller degree than f(x). Since every coefficient of g(x)
belongs to Q, there is a positive integer m such that all the coefficients of m·g(x) are integers.
Therefore, we now have

$f(x) = g(x) \cdot h(x) = (m \cdot g(x)) \cdot \left(\frac{1}{m} \cdot h(x)\right).$

Next, let c be the greatest common divisor of all the coefficients in m·g(x). We now have

$f(x) = g(x) \cdot h(x) = (m \cdot g(x)) \cdot \left(\frac{1}{m} \cdot h(x)\right) = \left(\frac{m}{c} \cdot g(x)\right) \cdot \left(\frac{c}{m} \cdot h(x)\right).$

If we let $G(x) = \frac{m}{c} \cdot g(x)$ and $H(x) = \frac{c}{m} \cdot h(x)$, then we can easily see that

(a) f(x) = G(x)·H(x);
(b) deg(G(x)) = deg(g(x)), deg(H(x)) = deg(h(x)); and
(c) G(x) has integer coefficients and the greatest common divisor of the coefficients of G(x)
is 1.

In light of this, it now suffices to show that H(x) also has integer coefficients. Since H(x) has
rational coefficients, there is a smallest positive integer n such that n·H(x) has integer
coefficients. Therefore, our goal is to show that n = 1. We know that every integer greater than
1 is divisible by some prime number. Thus, the only positive integer not divisible by a prime
number is 1. As a result, the way that we will show that n = 1 is to show that n is not divisible
by any prime number.


Therefore, by way of contradiction, let us assume that there is some prime number p that
divides n. Let

$\rho : Z[x] \to Z_p[x]$

be the homomorphism, where

$\rho(a_m x^m + \cdots + a_1 x + a_0) = [a_m]_p x^m + \cdots + [a_1]_p x + [a_0]_p,$

for all $a_i \in Z$. Since f(x) = G(x)·H(x), multiplying this equation by n results in

(4) n·f(x) = G(x)·(n·H(x)).

Every coefficient of n·f(x) is divisible by p, so $\rho(n \cdot f(x)) = [0]_p$. Furthermore, since the
greatest common divisor of the coefficients of G(x) is 1, at least one of the coefficients of
G(x) is not divisible by p. Hence, $\rho(G(x)) \ne [0]_p$.

Since ρ is a homomorphism defined on elements of Z[x], when we apply ρ to (4) and use that
both G(x) and n·H(x) belong to Z[x], we obtain

$[0]_p = \rho(n \cdot f(x)) = \rho(G(x) \cdot (n \cdot H(x))) = \rho(G(x)) \cdot \rho(n \cdot H(x)).$

However, $Z_p[x]$ is an integral domain, so the previous equation and the fact that
$\rho(G(x)) \ne [0]_p$ combine to tell us that $\rho(n \cdot H(x)) = [0]_p$. As a result, every coefficient of
n·H(x) must be a multiple of p. This then tells us that every coefficient of $\frac{n}{p} \cdot H(x)$ must be an
integer. Since p divides n, we now know that $\frac{n}{p}$ is a positive integer that, when multiplied by
H(x), gives us a polynomial with integer coefficients. But $\frac{n}{p}$ is less than n, and n was chosen to
be the smallest positive integer that when multiplied by H(x) gives us a polynomial with
integer coefficients. This is a contradiction, so we can conclude that n = 1 and H(x) does
indeed have integer coefficients, thereby concluding the proof.
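
The rescaling used in the proof can be carried out mechanically. The sketch below is an illustration only: the function name `gauss_adjust` is our own, m is taken to be the least common multiple of the denominators (the proof allows any suitable positive integer), and the inputs are assumed to be coefficient lists, leading term first, whose product has integer coefficients.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def gauss_adjust(g, h):
    """Given g, h in Q[x] whose product lies in Z[x], return (G, H) with
    G = (m/c)·g and H = (c/m)·h, as in the proof of Gauss' Lemma."""
    g = [Fraction(c) for c in g]
    h = [Fraction(c) for c in h]
    m = reduce(lcm, (c.denominator for c in g), 1)     # m·g has integer coefficients
    c = reduce(gcd, (int(m * coef) for coef in g), 0)  # gcd of the coefficients of m·g
    G = [coef * m / c for coef in g]
    H = [coef * c / m for coef in h]
    return G, H

# The factorization 2x^2 + 13x - 7 = (3/2 x + 21/2)(4/3 x - 2/3) from the discussion above
G, H = gauss_adjust([Fraction(3, 2), Fraction(21, 2)],
                    [Fraction(4, 3), Fraction(-2, 3)])
print(G, H)   # the coefficients of x + 7 and 2x - 1
```

Note that the code only performs the rescaling; the content of Gauss’ Lemma is the guarantee that the second list always ends up with integer entries.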


Examples


Consider the polynomial $f(x) = x^4 + 2x^2 - 1$; we will use the Rational Root Test along
with Gauss’ Lemma to show that f(x) is irreducible over Q. If f(x) were reducible over Q,
one possibility is that it has a factor of degree 1. But having a factor of degree 1 in Q[x] is
equivalent to having a rational root. However, the Rational Root Test tells us that the
only possible rational roots of f(x) are ±1. Since f(1) = f(−1) = 2, we now know that
f(x) has no factors of degree 1 in Q[x].


Now, since f(x) has degree 4 and has no factor of degree 1 in Q[x], the only way it could
possibly be reducible over Q is to be the product of two polynomials in Q[x], both of
degree 2. However, Gauss’ Lemma now tells us that if f(x) could be factored this way in
Q[x], then f(x) would also be the product of two polynomials of degree 2 with
coefficients in Z. Let us now suppose that f(x) = g(x)·h(x), where g(x) and h(x) are
quadratic polynomials in Z[x]. Since the trailing coefficient of f(x) is −1, the product of
the trailing coefficients of g(x) and h(x) is −1. Therefore, one of g(x) or h(x) has a
trailing coefficient of 1, and the other has a trailing coefficient of −1. Similarly, the
product of the leading coefficients of g(x) and h(x) must be 1, so either both g(x) and
h(x) have leading coefficients of 1, or they both have leading coefficients of −1. However,
in the latter case, f(x) = (−g(x))·(−h(x)), and both −g(x) and −h(x) have leading
coefficients of 1. As a result, if f(x) is reducible in Q[x], then we can write

$f(x) = x^4 + 2x^2 - 1 = (x^2 + ax + 1)(x^2 + bx - 1),$

where a, b ∈ Z. However,

$(x^2 + ax + 1)(x^2 + bx - 1) = x^4 + (a + b)x^3 + (ab)x^2 + (-a + b)x - 1.$

Therefore, if

$x^4 + 2x^2 - 1 = x^4 + (a + b)x^3 + (ab)x^2 + (-a + b)x - 1,$

by comparing the coefficients of each term, we can conclude that a and b must satisfy

a + b = 0, ab = 2, and −a + b = 0.

The first and third equations force a = b = 0, which contradicts ab = 2.
Since there are no integers that can simultaneously satisfy these three equations, we can
conclude that f(x) is indeed irreducible over Q.


In the preceding argument, if we did not know Gauss’ Lemma, then we could not have 
immediately reduced to the case where g(x) and h(x) had leading coefficients of 1 with trailing 
coefficients of 1 and —1. As a result, the computations involved in showing that f(x) was 
irreducible over Q would have been lengthier and more involved. Before leaving this example, 
observe that the Intermediate Value Theorem tells us that f(x) has a real root between —1 and 
0 and a real root between 0 and 1. Also note that the derivative of f(x) is equal to $4x^3 + 4x$.
Thus, f(x) is increasing whenever x is positive and decreasing whenever x is negative. Hence, 
f(x) cannot have more than one negative real root or more than one positive real root. In light 
of this, we know that f(x) has exactly two real roots, neither of which is a rational number. 


When dealing with a polynomial of degree 4 or more in Q[x], it is often quite difficult to 
determine whether or not it is reducible over Q. For example, consider the following nine very 
similar looking polynomials: 


$x^4 + 2x^2 + 6$, $x^4 + 3x^2 + 6$, $x^4 + 4x^2 + 6$, $x^4 + 5x^2 + 6$, $x^4 + 6x^2 + 6$,
$x^4 + 7x^2 + 6$, $x^4 + 8x^2 + 6$, $x^4 + 9x^2 + 6$, $x^4 + 10x^2 + 6$.

One easy observation is that none of these polynomials has any real roots, as $x^4$ and $x^2$ are
never negative. So, in particular, none of these polynomials has any rational roots. However,
this does not guarantee that they are all irreducible over Q. In particular, you may have noticed
that

$x^4 + 5x^2 + 6 = (x^2 + 2)(x^2 + 3)$ and $x^4 + 7x^2 + 6 = (x^2 + 1)(x^2 + 6)$.
But what about the other seven polynomials from our collection? 


The bad news is that there is no simple test that determines, for all polynomials f(x) ∈ Q[x],
whether or not f(x) is reducible over Q. (Algorithms for factoring polynomials over Q do exist,
but they are considerably more involved than the tests in this chapter.) On the other hand, the
good news is that there are several irreducibility tests that, when combined with Gauss’ Lemma,
can handle many special cases. In fact, the next test can be used on the seven remaining
polynomials from our collection.


Theorem 9.10—Eisenstein’s Criterion. Let 


$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \in Z[x].$

If there exists a prime number p with the following properties:
(a) p does not divide $a_n$,
(b) p divides every coefficient of f(x) other than $a_n$,
(c) $p^2$ does not divide $a_0$,

then f(x) is irreducible in Q[x].


Before proving Eisenstein’s Criterion, let us see how useful it can be. 


Examples

First, consider the five polynomials

$x^4 + 2x^2 + 6$, $x^4 + 4x^2 + 6$, $x^4 + 6x^2 + 6$, $x^4 + 8x^2 + 6$, $x^4 + 10x^2 + 6$.

For all of these polynomials, we can apply Eisenstein’s Criterion with the prime p = 2.
The leading coefficient of each polynomial is 1, so 2 does not divide the leading
coefficient. Next, all the other coefficients are even, so 2 does divide the other
coefficients. The constant term of each polynomial is 6, and $2^2 = 4$ does not divide 6.
Therefore, p = 2 satisfies all three of the criterion’s properties, and we can conclude
that all five polynomials are irreducible over Q. Observe that for the polynomial
$x^4 + 6x^2 + 6$, we could also have applied Eisenstein’s Criterion with the
prime p = 3.

We still need to consider the polynomials $x^4 + 3x^2 + 6$ and $x^4 + 9x^2 + 6$. Note that in
these cases we cannot apply Eisenstein’s Criterion with p = 2 as 2 does not divide the
coefficient of $x^2$ in either of these polynomials. However, for these two polynomials, the
prime number p = 3 satisfies the three properties needed to apply Eisenstein’s Criterion.
Therefore, these two polynomials are also irreducible over Q.
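
Checking the three properties of Eisenstein’s Criterion is purely mechanical, as the following sketch suggests. The function name `eisenstein_applies` is our own, coefficients are assumed to be listed from the leading term down, and the caller is assumed to supply a prime p (the code does not verify primality).

```python
def eisenstein_applies(coeffs, p):
    """True when the prime p satisfies properties (a)-(c) of Eisenstein's
    Criterion for the integer polynomial with the given coefficients."""
    leading, *middle, constant = coeffs
    return (
        leading % p != 0                      # (a) p does not divide a_n
        and all(c % p == 0 for c in middle)   # (b) p divides the middle coefficients ...
        and constant % p == 0                 #     ... and the constant term
        and constant % (p * p) != 0           # (c) p^2 does not divide a_0
    )

# The nine polynomials x^4 + k x^2 + 6 for k = 2, ..., 10
for k in range(2, 11):
    primes = [p for p in (2, 3) if eisenstein_applies([1, 0, k, 0, 6], p)]
    print(k, primes)
```

A result of `False` only means that the criterion does not apply with that prime; it says nothing about whether the polynomial is reducible.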


It is certainly the case that if Eisenstein’s Criterion applies to a polynomial, then it must be 
irreducible over Q. But it is very important to keep in mind that the converse does not hold. 
In particular, if you are unable to apply Eisenstein’s Criterion to a polynomial, it does not 
mean that the polynomial is reducible over Q. If you are unable to apply Eisenstein’s 
Criterion, it simply means that you need to find some other test or technique to deal with 
that particular polynomial. For example, let us compare the two similar looking 
polynomials 


$x^4 + 2x^2 - 1$ and $x^4 + 2x^2 + 1$.

Eisenstein’s Criterion does not apply to either of these polynomials. Whereas we have already
shown that $x^4 + 2x^2 - 1$ is irreducible over Q, $x^4 + 2x^2 + 1$ is reducible as

$x^4 + 2x^2 + 1 = (x^2 + 1)(x^2 + 1) = (x^2 + 1)^2.$


Proof of Theorem 9.10. Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \in Z[x]$ and suppose that
p is a prime number such that p satisfies the three properties listed in the statement of the
criterion. By way of contradiction, let us also suppose that f(x) is reducible over Q. Therefore,
by Gauss’ Lemma, f(x) is also reducible over Z. As a result, there exist g(x), h(x) ∈ Z[x]
such that

f(x) = g(x)·h(x),

where deg(g(x)), deg(h(x)) < n.

Observe that if $b_s x^s$ and $c_t x^t$ are the leading terms of, respectively, g(x) and h(x), then

$a_n = b_s \cdot c_t$ and 0 < s, t < n.

In particular, since $p \nmid a_n$, it follows that $p \nmid b_s$ and $p \nmid c_t$. Also note that if $b_0$ and $c_0$ are the
constant terms of, respectively, g(x) and h(x), then $a_0 = b_0 \cdot c_0$.

As in the proof of Gauss’ Lemma, let

$\rho : Z[x] \to Z_p[x]$

be the homomorphism where

$\rho(a_m x^m + \cdots + a_1 x + a_0) = [a_m]_p x^m + \cdots + [a_1]_p x + [a_0]_p,$

for all $a_i \in Z$. Since p divides all the coefficients of f(x) other than $a_n$, we have

$\rho(f(x)) = [a_n]_p x^n.$

Using the fact that ρ is a homomorphism, we have

$\rho(f(x)) = \rho(g(x) \cdot h(x)) = \rho(g(x)) \cdot \rho(h(x)).$

By Lemma 9.3(c), since ρ(f(x)) consists of only a single term with a nonzero coefficient, the
same must be true of ρ(g(x)) and ρ(h(x)). Having already shown that $[b_s]_p \ne [0]_p$ and
$[c_t]_p \ne [0]_p$, it now follows that

$\rho(g(x)) = [b_s]_p x^s$ and $\rho(h(x)) = [c_t]_p x^t.$

Since s > 0 and the coefficient of $x^s$ is the only nonzero coefficient in ρ(g(x)), it follows that
$[b_0]_p = [0]_p$. Using the same reasoning, it also follows that $[c_0]_p = [0]_p$. As a result, $p \mid b_0$ and
$p \mid c_0$. But this immediately implies that $p^2 \mid (b_0 \cdot c_0)$. Since $a_0 = b_0 \cdot c_0$, this results in the
contradiction $p^2 \mid a_0$, thereby proving the result.


Exercises for Section 9.3 


In exercises 1-28, determine if the given polynomial is reducible in Q[x]. Remember, if a 
polynomial in Q[x] has degree 2 or 3, then the rational root test might not be the most efficient 
test available but it can always be used to determine if the polynomial is reducible. For 
polynomials in Q[x] of degree 4 and above, you can try to apply the rational root test or 
Eisenstein’s Criterion. However, both of these tests could be inconclusive. If so, it might be 
necessary to apply Gauss’ Lemma and try to solve a series of equations. For example, if you 
are trying to determine if a monic polynomial of degree 4 in Z[x] factors into a product of two 
quadratics, you could try to find a, b, c, d ∈ Z such that your polynomial can be written as
$(x^2 + ax + b)(x^2 + cx + d) = x^4 + (a + c)x^3 + (ac + b + d)x^2 + (ad + bc)x + bd$. At this point,
you would need to find integer solutions to four equations in the four unknowns a, b, c, d.
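
For a monic quartic with nonzero constant term, the search for such a, b, c, d is finite, since b must divide the constant term. The following is a rough sketch of that search, not a prescribed method: the name `monic_quadratic_factorizations` is our own, and the arguments `p3, p2, p1, p0` are assumed to be the integer coefficients of $x^3$, $x^2$, $x$, and the constant term.

```python
from math import isqrt

def monic_quadratic_factorizations(p3, p2, p1, p0):
    """Integer tuples (a, b, c, d) with
    (x^2 + ax + b)(x^2 + cx + d) = x^4 + p3 x^3 + p2 x^2 + p1 x + p0,
    assuming p0 != 0."""
    if p0 == 0:
        raise ValueError("constant term is 0, so x is already a factor")
    solutions = []
    for b in range(-abs(p0), abs(p0) + 1):
        if b == 0 or p0 % b != 0:
            continue
        d = p0 // b                           # bd = p0
        # a + c = p3 and ac = p2 - b - d, so a is a root of t^2 - p3 t + (p2 - b - d)
        disc = p3 * p3 - 4 * (p2 - b - d)
        if disc < 0:
            continue
        root = isqrt(disc)
        if root * root != disc:
            continue
        for sign in {root, -root}:
            if (p3 + sign) % 2 == 0:
                a = (p3 + sign) // 2
                c = p3 - a
                if a * d + b * c == p1:       # the coefficient of x must also match
                    solutions.append((a, b, c, d))
    return solutions

print(monic_quadratic_factorizations(0, 2, 0, -1))   # x^4 + 2x^2 - 1
print(monic_quadratic_factorizations(0, 5, 0, 6))    # x^4 + 5x^2 + 6
```

An empty list rules out a factorization into two monic quadratics in Z[x]. By Gauss’ Lemma, together with a check that there are no rational roots, this shows that a monic quartic in Z[x] is irreducible over Q.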


(ae ane 

Re gf A 

3. xt-8 

Ae ae Og Da 

5 ae ee O 

6. xt4234+4x74+3x4+3 
7. xt4+33 49x? + 3x43 
8. xt 43x34+9x7 43x43 
9. 2x4 13x74 13x7+7x-—6 
10. x4-—9x7+14 
Lex? = 83" $14 

123, Sea 

13. 105x? —21x+429 
14. x3 —24x4+5 

15. x3 —25x+5 

16. x4+18x7+77 




17. 
18. 
19. 
20. 
21. 
22. 
23. 
24. 
25. 
26. 
27. 
28. 
29. 
30. 
31. 
32. 
33. 
34. 
35. 
36. 
37. 
38. 


x* + 18x? +78 

xt++4x?—x+6 

x4 44x? —2x+6 

a Aa 625 

x® —63 

x® — 64 

x® — 65 

x® — 8x3 — 65 

x® + 20x? +96 

x®+21x3 +96 

x4 4433 43x? +2x+1 

KP 

Find all integers A such that $x^4 - Ax^2 + 2$ is reducible in Q[x].
Find all integers A such that $x^4 - Ax^2 - 2$ is reducible in Q[x].
Find all integers A such that $x^4 - Ax^2 - 1$ is reducible in Q[x].
If p is a prime number, find all integers A such that ja Age a: p is reducible in Q[x]. 
If p is a prime number, find all integers A such that qe Ane p is reducible in Q[x]. 
Find all integers A such that $x^4 - Ax^2 + 10$ is reducible in Q[x].
Find all integers A such that $x^4 - Ax^2 - 10$ is reducible in Q[x].
Find all integers A such that $x^4 - Ax^2 + 15$ is reducible in Q[x].
Find all integers A such that $x^4 - Ax^2 - 15$ is reducible in Q[x].
Find all integers A such that $x^4 - Ax^2 - 4$ is reducible in Q[x].


9.4 Reduction Modulo p 


Let us now consider whether the polynomial $f(x) = 273x^3 - 491x^2 + 935$ is reducible
over Q. Eisenstein’s Criterion does not apply to f(x) as the only primes which divide 935 are
5, 11, 17 and none of them divide −491. Since the degree of f(x) is only 3, we could apply
the Rational Root Test as f(x) will be reducible over Q if and only if it has a rational root.
For this polynomial, the Rational Root Test provides us with 128 candidates, as every rational
number of the form $\pm\frac{5^a 11^b 17^c}{3^d 7^e 13^f}$ is a candidate, where a, b, c, d, e, f can take on any values
from the set {0, 1}. However, as a practical matter, we might hope to find a test that is simpler
to use in this case. In fact, the following test can make short work of this and many other
polynomials.


Theorem 9.11. Suppose $f(x) = a_n x^n + \cdots + a_1 x + a_0 \in \mathbb{Z}[x]$ and also suppose that $p$ is a prime number such that $p$ does not divide $a_n$. Let $\rho : \mathbb{Z}[x] \to \mathbb{Z}_p[x]$ be the homomorphism defined as

\[ \rho(a_n x^n + \cdots + a_1 x + a_0) = [a_n]_p x^n + \cdots + [a_1]_p x + [a_0]_p. \]

If $\rho(f(x))$ is irreducible in $\mathbb{Z}_p[x]$, then $f(x)$ is irreducible in $\mathbb{Q}[x]$.


Before proving Theorem 9.11, let us return to the polynomial $f(x) = 273x^3 - 491x^2 + 935$. If we consider the prime $p = 2$, observe that $[273]_2 = [1]_2$, $[-491]_2 = [1]_2$, and $[935]_2 = [1]_2$. Therefore,

\[ \rho(f(x)) = x^3 + x^2 + [1]_2. \]

Since the degree of $x^3 + x^2 + [1]_2$ is 3, it will be reducible over $\mathbb{Z}_2$ if and only if it has a root in $\mathbb{Z}_2$. However, plugging both $[0]_2$ and $[1]_2$ into $x^3 + x^2 + [1]_2$ gives an answer of $[1]_2$. Thus, $x^3 + x^2 + [1]_2$ is irreducible over $\mathbb{Z}_2$, and Theorem 9.11 now asserts that $273x^3 - 491x^2 + 935$ is irreducible over $\mathbb{Q}$.
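
The computation above can be replayed mechanically. The following is a minimal Python sketch, not part of the text: the function names are hypothetical, and polynomials are assumed to be stored as lists of integer coefficients with the highest degree first.

```python
def reduce_mod_p(coeffs, p):
    """Reduce integer coefficients (highest degree first) modulo p."""
    return [c % p for c in coeffs]


def evaluate_mod_p(coeffs, x, p):
    """Evaluate the polynomial at x in Z_p using Horner's rule."""
    value = 0
    for c in coeffs:
        value = (value * x + c) % p
    return value


def has_root_mod_p(coeffs, p):
    """Return True if the polynomial has a root in Z_p."""
    return any(evaluate_mod_p(coeffs, x, p) == 0 for x in range(p))


f = [273, -491, 0, 935]            # 273x^3 - 491x^2 + 935
reduced = reduce_mod_p(f, 2)       # coefficients of the reduction in Z_2[x]
values = [evaluate_mod_p(reduced, x, 2) for x in range(2)]
print(reduced, values, has_root_mod_p(reduced, 2))
```

Because the reduced polynomial has degree 3, the absence of a root in $\mathbb{Z}_2$ is enough to conclude that it is irreducible there; for degree 4 and higher a root search alone would not suffice.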


At this point, it is important to give you a warning similar to the one we discussed after Eisenstein’s Criterion. Certainly, if you can apply Theorem 9.11 to a polynomial, then the polynomial is irreducible over $\mathbb{Q}$. However, the converse does not hold. If you try to apply Theorem 9.11 to a polynomial $f(x)$ and the polynomial $\rho(f(x))$ is reducible in $\mathbb{Z}_p[x]$, it does not tell you whether or not $f(x)$ is reducible in $\mathbb{Q}[x]$. It simply means that you will need to find another technique or test to determine whether or not $f(x)$ is reducible over $\mathbb{Q}$. For example, consider the polynomials $f(x) = x^2 + 2$ and $g(x) = x^2 + 3x + 2$. If we use the prime $p = 3$, then

\[ \rho(f(x)) = x^2 + [2]_3 = \rho(g(x)) \]

in $\mathbb{Z}_3[x]$.


Therefore, although the polynomials $f(x)$ and $g(x)$ are quite different in $\mathbb{Q}[x]$, they become indistinguishable in $\mathbb{Z}_3[x]$. In this case, $\rho(f(x))$ and $\rho(g(x))$ are reducible in $\mathbb{Z}_3[x]$ as

\[ \rho(f(x)) = \rho(g(x)) = x^2 + [2]_3 = (x + [1]_3)(x + [2]_3). \]

However, the fact that $\rho(f(x))$ and $\rho(g(x))$ are reducible in $\mathbb{Z}_3[x]$ gives us no information about the reducibility of $f(x)$ and $g(x)$ over $\mathbb{Q}$. In fact, $f(x)$ is irreducible over $\mathbb{Q}$, as it has no rational roots, whereas $g(x)$ is reducible and factors as $g(x) = x^2 + 3x + 2 = (x+1)(x+2)$.


Proof of Theorem 9.11. Suppose $f(x) = a_n x^n + \cdots + a_1 x + a_0 \in \mathbb{Z}[x]$ and also suppose that $p$ is a prime number such that $p$ does not divide $a_n$ and $\rho(f(x))$ is irreducible in $\mathbb{Z}_p[x]$. By way of contradiction, let us assume that $f(x)$ is reducible over $\mathbb{Q}$. Therefore, by Gauss’ Lemma, $f(x)$ is also reducible over $\mathbb{Z}$. As a result, there exist $g(x), h(x) \in \mathbb{Z}[x]$ such that

\[ f(x) = g(x)\cdot h(x) \quad\text{and}\quad \deg(g(x)), \deg(h(x)) > 0. \]

As in the proof of Theorem 9.10, since $p$ does not divide the leading coefficient of $f(x)$, it follows that $p$ divides neither the leading coefficient of $g(x)$ nor the leading coefficient of $h(x)$. In light of this, $\deg(\rho(g(x))) = \deg(g(x)) > 0$ and $\deg(\rho(h(x))) = \deg(h(x)) > 0$. Since $\rho$ is a homomorphism, we have

\[ \rho(f(x)) = \rho(g(x)\cdot h(x)) = \rho(g(x))\cdot\rho(h(x)). \]

However, this says that $\rho(f(x))$ has been written in $\mathbb{Z}_p[x]$ as the product of two polynomials of degree greater than 0. This contradicts the fact that $\rho(f(x))$ is irreducible in $\mathbb{Z}_p[x]$, thereby proving the result.


■ Examples


Let us consider whether the polynomial $f(x) = 36x^3 + 34x + 12$ is reducible over $\mathbb{Q}$. Observe that we cannot apply Eisenstein’s Criterion. The reason is that the only prime $p$ such that $p$ divides the constant term and $p^2$ does not divide the constant term is 3. However, we cannot use the prime 3 for two reasons. First, it divides the leading coefficient, and second, it doesn’t divide the coefficient of $x$. Since the degree of $f(x)$ is only 3, we could use the Rational Root Test, as $f(x)$ will be reducible over $\mathbb{Q}$ if and only if it has a rational root. For this polynomial, it looks like the Rational Root Test might produce a large number of candidates. Therefore, we might try to find a simpler solution and only use the Rational Root Test as a last resort. If we try to apply Theorem 9.11, we cannot use the primes $p = 2$ or $p = 3$, since they are both divisors of the leading coefficient. Next, we look at what happens if we try the prime $p = 5$. In this case, $[36]_5 = [1]_5$, $[34]_5 = [4]_5$, and $[12]_5 = [2]_5$. As a result, we now have

\[ \rho(f(x)) = x^3 + [4]_5 x + [2]_5. \]

Theorem 9.11 tells us that if $\rho(f(x))$ is irreducible in $\mathbb{Z}_5[x]$, then $f(x)$ is irreducible in $\mathbb{Q}[x]$. But be aware that if $\rho(f(x))$ turns out to be reducible over $\mathbb{Z}_5$, then we will have obtained no useful information, and we will need to go back to the drawing board.


Since the degree of $x^3 + [4]_5 x + [2]_5$ is only 3, it will be reducible over $\mathbb{Z}_5$ if and only if it has a root in $\mathbb{Z}_5$. However, you can check that the values when we plug $[0]_5, [1]_5, [2]_5, [3]_5, [4]_5$ into $x^3 + [4]_5 x + [2]_5$ are, respectively, $[2]_5, [2]_5, [3]_5, [1]_5, [2]_5$. Therefore, $\rho(f(x))$ is irreducible over $\mathbb{Z}_5$, and Theorem 9.11 asserts that $f(x)$ is irreducible over $\mathbb{Q}$.
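
The search for a usable prime in this example can be automated for cubics. The sketch below is only an illustration under stated assumptions: the helper names and the search bound `limit` are hypothetical, coefficients are listed with the highest degree first, and the root test is a valid irreducibility test only because the degree is 3.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def values_mod_p(coeffs, p):
    """Values of the polynomial at 0, 1, ..., p-1, computed in Z_p."""
    out = []
    for x in range(p):
        value = 0
        for c in coeffs:
            value = (value * x + c) % p
        out.append(value)
    return out


def first_good_prime(cubic, limit=100):
    """Smallest prime p < limit that does not divide the leading coefficient
    and for which the reduced cubic has no root in Z_p; None if none is found."""
    for p in range(2, limit):
        if is_prime(p) and cubic[0] % p != 0 and 0 not in values_mod_p(cubic, p):
            return p
    return None


f = [36, 0, 34, 12]                 # 36x^3 + 34x + 12
print(values_mod_p(f, 5))           # values at [0]_5, ..., [4]_5
print(first_good_prime(f))
```

A return value of `None` would not mean that the cubic is reducible over $\mathbb{Q}$; it would only mean that Theorem 9.11 gave no information for the primes tried.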


After looking at the examples in this chapter, it should be clear that whereas tests like the 
Rational Root Test and Eisenstein’s Criterion are extremely useful for determining when many 
polynomials are irreducible in Q[x], there are also many other polynomials for which these 
tests are inconclusive and give us no information. 


Similarly, Theorem 9.11 is useful for many polynomials and inconclusive for many others. This situation raises an interesting question: If $f(x) \in \mathbb{Z}[x]$ is irreducible, does there exist a prime $p$ such that $p$ does not divide the leading coefficient of $f(x)$ and $\rho(f(x))$ is irreducible in $\mathbb{Z}_p[x]$? In other words, if $f(x) \in \mathbb{Z}[x]$ is irreducible, is there some $p$ out there that will enable us to apply Theorem 9.11? As we will soon see, the answer is no. That is, there exist polynomials in $\mathbb{Z}[x]$ that are irreducible over $\mathbb{Q}$ yet reducible in $\mathbb{Z}_p[x]$ for every prime $p$. These examples will once again indicate, in a very strong way, that if $\rho(f(x))$ is reducible in $\mathbb{Z}_p[x]$, we cannot draw any conclusions as to whether or not $f(x)$ is reducible over $\mathbb{Q}$. In order to produce these examples, we will need three lemmas.


Lemma 9.12. If $A$ is an integer, then the polynomial $f(x) = x^4 - Ax^2 + 1$ is reducible over $\mathbb{Q}$ if and only if either $A+2$ or $A-2$ is a perfect square in $\mathbb{Z}$.


Proof. If $f(x)$ has a factor of degree 1, then it has a rational root. However, the Rational Root Test asserts that the only possible rational roots of $f(x)$ are $\pm 1$. Since $f(1) = f(-1) = 2 - A$, the only time $f(x)$ has a factor of degree 1 is when $2 - A = 0$, so $A = 2$. Observe that when $A = 2$, we have $A - 2 = 0 = 0^2$, so $A - 2$ is a perfect square in $\mathbb{Z}$.


In light of the previous paragraph, it now suffices to show that $f(x)$ factors into the product of two quadratic polynomials in $\mathbb{Q}[x]$ if and only if either $A+2$ or $A-2$ is a perfect square in $\mathbb{Z}$. By Gauss’ Lemma, if $g(x), h(x)$ are quadratics in $\mathbb{Q}[x]$ such that $f(x) = g(x)\cdot h(x)$, then we may assume that $g(x), h(x) \in \mathbb{Z}[x]$. Since the constant term of $f(x)$ is 1, the product of the constant terms of $g(x)$ and $h(x)$ is 1. Similarly, the product of the leading coefficients of $g(x)$ and $h(x)$ must also be 1. It is not hard to see that this implies that $f(x)$ must factor as either

\[ (x^2 + ax + 1)(x^2 + bx + 1) \quad\text{or}\quad (x^2 + ax - 1)(x^2 + bx - 1), \]

where $a, b \in \mathbb{Z}$. If $f(x) = (x^2 + ax + 1)(x^2 + bx + 1)$, then we have

\[ x^4 - Ax^2 + 1 = (x^2 + ax + 1)(x^2 + bx + 1) = x^4 + (a+b)x^3 + (ab+2)x^2 + (a+b)x + 1. \]

Comparing the coefficients of the various terms, this tells us that $a$ and $b$ must satisfy the equations

\[ a + b = 0 \quad\text{and}\quad ab + 2 = -A. \]

But these equations imply that $b = -a$ and $a^2 = A + 2$. Therefore, if $f(x)$ factors in this way, then there exist $a, b \in \mathbb{Z}$ satisfying these equations and $A + 2$ is a perfect square in $\mathbb{Z}$. In the opposite direction, if $a \in \mathbb{Z}$ such that $A + 2 = a^2$, then the preceding calculations indicate that

\[ f(x) = x^4 - Ax^2 + 1 = x^4 - (a^2 - 2)x^2 + 1 = (x^2 + ax + 1)(x^2 - ax + 1) \]

and $f(x)$ is the product of two quadratics.

Similarly, if $f(x) = (x^2 + ax - 1)(x^2 + bx - 1)$, we now have

\[ x^4 - Ax^2 + 1 = (x^2 + ax - 1)(x^2 + bx - 1) = x^4 + (a+b)x^3 + (ab-2)x^2 - (a+b)x + 1. \]

If we again compare the coefficients of the various terms, we see that $a$ and $b$ must satisfy the equations

\[ a + b = 0 \quad\text{and}\quad ab - 2 = -A. \]

These equations imply that $b = -a$ and $a^2 = A - 2$. Thus, if $f(x)$ factors in this way, then there exist $a, b \in \mathbb{Z}$ satisfying these equations and $A - 2$ is a perfect square in $\mathbb{Z}$. Going in the other direction, if $a \in \mathbb{Z}$ such that $A - 2 = a^2$, then our previous calculations show that

\[ f(x) = x^4 - Ax^2 + 1 = x^4 - (a^2 + 2)x^2 + 1 = (x^2 + ax - 1)(x^2 - ax - 1) \]

and $f(x)$ is the product of two quadratics.


■ Examples


In the next several examples, we will apply Lemma 9.12 to polynomials of the form $x^4 - Ax^2 + 1$, for various values of $A$.

1. If $A = 4$, then $x^4 - 4x^2 + 1$ is irreducible over $\mathbb{Q}$ as $A + 2 = 6$ and $A - 2 = 2$ are not perfect squares in $\mathbb{Z}$.

2. If $A = 3$, then $x^4 - 3x^2 + 1$ is reducible over $\mathbb{Q}$ as $A - 2 = 1 = 1^2$. Looking back at the proof of Lemma 9.12, $x^4 - 3x^2 + 1 = (x^2 + ax - 1)(x^2 + bx - 1)$, where $b = -a$ and $a^2 = A - 2$. Therefore, $a = 1$, $b = -1$ and

\[ x^4 - 3x^2 + 1 = (x^2 + x - 1)(x^2 - x - 1). \]

3. If $A = 2$, then $x^4 - 2x^2 + 1$ is reducible over $\mathbb{Q}$ as $A - 2 = 0 = 0^2$. In this case, 1 and $-1$ are roots of $x^4 - 2x^2 + 1$ and we have

\[ x^4 - 2x^2 + 1 = (x+1)(x+1)(x-1)(x-1) = (x+1)^2 (x-1)^2. \]

4. If $A = 1$, then $x^4 - x^2 + 1$ is irreducible over $\mathbb{Q}$ as $A + 2 = 3$ and $A - 2 = -1$ are not perfect squares in $\mathbb{Z}$.

5. If $A = 0$, then $x^4 + 1$ is irreducible over $\mathbb{Q}$ as $A + 2 = 2$ and $A - 2 = -2$ are not perfect squares in $\mathbb{Z}$.

6. If $A = -1$, then $x^4 + x^2 + 1$ is reducible over $\mathbb{Q}$ as $A + 2 = 1 = 1^2$. Looking back at the proof of Lemma 9.12, $x^4 + x^2 + 1 = (x^2 + ax + 1)(x^2 + bx + 1)$, where $b = -a$ and $a^2 = A + 2$. Therefore, $a = 1$, $b = -1$ and

\[ x^4 + x^2 + 1 = (x^2 + x + 1)(x^2 - x + 1). \]

7. If $A = -2$, then $x^4 + 2x^2 + 1$ is reducible over $\mathbb{Q}$ as $A + 2 = 0 = 0^2$. The proof of Lemma 9.12 tells us that $x^4 + 2x^2 + 1 = (x^2 + ax + 1)(x^2 + bx + 1)$, where $b = -a$ and $a^2 = A + 2$. Therefore, $a = 0$, $b = 0$ and

\[ x^4 + 2x^2 + 1 = (x^2 + 1)(x^2 + 1) = (x^2 + 1)^2. \]

8. If $A < -2$, then $x^4 - Ax^2 + 1$ is always irreducible over $\mathbb{Q}$ as $A + 2$ and $A - 2$ are both negative and therefore cannot be perfect squares in $\mathbb{Z}$.
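
The criterion of Lemma 9.12 is easy to tabulate. The following short Python sketch, with hypothetical function names, applies it to a range of values of $A$ chosen only for illustration.

```python
from math import isqrt


def is_perfect_square(n):
    """True if n is the square of an integer."""
    return n >= 0 and isqrt(n) ** 2 == n


def reducible_over_Q(A):
    """Lemma 9.12: x^4 - A x^2 + 1 is reducible over Q exactly when
    A + 2 or A - 2 is a perfect square in Z."""
    return is_perfect_square(A + 2) or is_perfect_square(A - 2)


reducible = [A for A in range(-5, 5) if reducible_over_Q(A)]
print(reducible)
```

For $-5 \le A \le 4$ the values flagged as reducible are the ones found in examples 2, 3, 6, and 7.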


Recall that we usually use the symbol 1 to denote the multiplicative identity element of a field. 
Going one step further, we often use the symbol 2 to represent the sum 1 + 1. However, 
observe that in the field $\mathbb{Z}_2$, we have $[1]_2 + [1]_2 = [2]_2 = [0]_2$. Therefore, there exist fields
where the symbol 2 is equal to the additive identity. Thus, depending on the field in which you 
are working, the element 2 might not have a multiplicative inverse. 


Lemma 9.13. Let $F$ be a field where 2 has a multiplicative inverse. If $A \in F$ and if either $A + 2$, $A - 2$, or $A^2 - 4$ is the square of an element in $F$, then the polynomial $x^4 - Ax^2 + 1$ is reducible in $F[x]$.


Proof. Let us first consider the case where $A + 2$ is the square of an element in $F$. In this case, we let $\sqrt{A+2}$ denote an element of $F$ whose square is $A + 2$ and observe that

\[ \left(x^2 + (\sqrt{A+2})x + 1\right)\left(x^2 - (\sqrt{A+2})x + 1\right) = x^4 + (\sqrt{A+2} - \sqrt{A+2})x^3 + \left(-(\sqrt{A+2})^2 + 2\right)x^2 + (\sqrt{A+2} - \sqrt{A+2})x + 1 = x^4 - Ax^2 + 1. \]

Therefore, in this case, we have factored $x^4 - Ax^2 + 1$ into the product of two quadratics in $F[x]$.


In the next case, suppose $A - 2$ is the square of an element in $F$ and let $\sqrt{A-2}$ denote an element of $F$ whose square is $A - 2$. We now have

\[ \left(x^2 + (\sqrt{A-2})x - 1\right)\left(x^2 - (\sqrt{A-2})x - 1\right) = x^4 + (\sqrt{A-2} - \sqrt{A-2})x^3 + \left(-(\sqrt{A-2})^2 - 2\right)x^2 + (\sqrt{A-2} - \sqrt{A-2})x + 1 = x^4 - Ax^2 + 1. \]

Once again, we have factored $x^4 - Ax^2 + 1$ into the product of two quadratics in $F[x]$.


For the final case, suppose that $A^2 - 4$ is the square of an element in $F$ and let $\sqrt{A^2-4}$ denote an element of $F$ whose square is $A^2 - 4$. This is the only case where we will need that 2 has a multiplicative inverse. In this case, we have

\[ \left(x^2 + \frac{-A + \sqrt{A^2-4}}{2}\right)\left(x^2 + \frac{-A - \sqrt{A^2-4}}{2}\right) = x^4 + \left(\frac{-A + \sqrt{A^2-4}}{2} + \frac{-A - \sqrt{A^2-4}}{2}\right)x^2 + \left(\frac{-A + \sqrt{A^2-4}}{2}\right)\left(\frac{-A - \sqrt{A^2-4}}{2}\right) = x^4 - Ax^2 + 1. \]

Thus, once again, we have factored $x^4 - Ax^2 + 1$ into the product of two quadratics in $F[x]$.
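
As a small illustration of the first and third cases, the following worked instances take $A = 0$, so the polynomial is $x^4 + 1$. The choice of the fields $\mathbb{Z}_7$ and $\mathbb{Z}_5$ is ours, made only because the arithmetic is short.

```latex
% Case A + 2 a square: in Z_7 we have [2]_7 = [3]_7^2, so take \sqrt{A+2} = [3]_7.
\[ (x^2 + [3]_7 x + [1]_7)(x^2 - [3]_7 x + [1]_7)
   = x^4 + ([2]_7 - [9]_7)x^2 + [1]_7 = x^4 + [1]_7. \]
% Case A^2 - 4 a square: in Z_5 we have [-4]_5 = [1]_5 = [1]_5^2 and [2]_5^{-1} = [3]_5,
% so the two constant terms are [1]_5 [3]_5 = [3]_5 and [-1]_5 [3]_5 = [2]_5.
\[ (x^2 + [3]_5)(x^2 + [2]_5) = x^4 + [5]_5 x^2 + [6]_5 = x^4 + [1]_5. \]
```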


Lemmas 9.12 and 9.13 indicate that the reducibility of polynomials of the form $x^4 - Ax^2 + 1$ is related to various elements being squares of other elements. To apply these results, we will need to examine the squares of elements in $\mathbb{Z}_p$.


Lemma 9.14. If $a, b \in \mathbb{Z}$ and if $p$ is a prime number, then at least one of $[a]_p$, $[b]_p$, $[a\cdot b]_p$ is the square of an element of $\mathbb{Z}_p$.


Proof. If either $[a]_p$ or $[b]_p$ is equal to $[0]_p$, then there is nothing to prove. Therefore, without loss of generality, we may consider the case where $[a]_p$, $[b]_p$, and $[a\cdot b]_p$ are all elements of $U(\mathbb{Z}_p)$.


We begin by supposing that $c, d \in \mathbb{Z}$ such that $1 \le c, d \le \frac{p-1}{2}$, and we claim that if $[c]_p^2 = [d]_p^2$, then $c = d$. To see this, observe that if $[c]_p^2 = [d]_p^2$, then $[c^2]_p = [d^2]_p$, which implies that $[c^2 - d^2]_p = [0]_p$. Thus,

\[ [0]_p = [c^2 - d^2]_p = [c - d]_p \cdot [c + d]_p. \]

Since $\mathbb{Z}_p$ is a field, this tells us that either

\[ [c - d]_p = [0]_p \quad\text{or}\quad [c + d]_p = [0]_p. \]

Therefore, either $p$ divides $c - d$ or $p$ divides $c + d$.


Since $1 \le c, d \le \frac{p-1}{2}$, we see that both $c - d$ and $c + d$ must lie strictly between $-p$ and $p$. Therefore, the only way $c - d$ or $c + d$ could be divisible by $p$ is to be equal to 0. But since $c$ and $d$ are both positive, it is impossible for $c + d$ to equal 0. Thus, the only remaining possibility is that $c - d = 0$, so $c = d$.


The preceding argument tells us that all $\frac{p-1}{2}$ elements of the list

\[ [1]_p^2,\ [2]_p^2,\ \ldots,\ \left[\tfrac{p-1}{2}\right]_p^2 \]

are different. Now suppose that $[a]_p$ and $[b]_p$ are both elements of $U(\mathbb{Z}_p)$ that are not squares of elements of $U(\mathbb{Z}_p)$. We claim that the new list

\[ [1]_p^2,\ [2]_p^2,\ \ldots,\ \left[\tfrac{p-1}{2}\right]_p^2,\ [a]_p\cdot[1]_p^2,\ [a]_p\cdot[2]_p^2,\ \ldots,\ [a]_p\cdot\left[\tfrac{p-1}{2}\right]_p^2 \]

is a complete listing of the $p - 1$ elements of $U(\mathbb{Z}_p)$. Since $U(\mathbb{Z}_p)$ has exactly $p - 1$ elements, we only need to show that all $p - 1$ elements of our list are different. Earlier in this proof, we already showed that the first $\frac{p-1}{2}$ elements of the list are all different from one another. In addition, a now familiar argument, already used in the proofs of Theorem 6.8 and Proposition 7.18, tells us that since we are dealing with elements of a group, the last $\frac{p-1}{2}$ elements on our list are also all different from one another. Therefore, it only remains to show that none of the first $\frac{p-1}{2}$ elements of our list can be equal to any of the last $\frac{p-1}{2}$ elements of our list. By way of contradiction, suppose there exist $[u]_p, [v]_p \in U(\mathbb{Z}_p)$ such that

\[ [a]_p \cdot [u]_p^2 = [v]_p^2. \]

If we let $[w]_p$ be the multiplicative inverse of $[u]_p$, then, using the previous equation, we have

\[ [a]_p = [a]_p\cdot[1]_p^2 = [a]_p\cdot([u]_p\cdot[w]_p)^2 = ([a]_p\cdot[u]_p^2)\cdot[w]_p^2 = [v]_p^2\cdot[w]_p^2 = ([v]_p\cdot[w]_p)^2 = [v\cdot w]_p^2. \]

But this contradicts the fact that $[a]_p$ is not the square of an element in $U(\mathbb{Z}_p)$. Thus, all $p - 1$ elements on our list are different from each other, and every element of $U(\mathbb{Z}_p)$ appears exactly once on the list.


Since we are assuming that $[b]_p$ is not the square of any element of $U(\mathbb{Z}_p)$, $[b]_p$ must appear on our list as one of the elements of the form $[a]_p\cdot[x]_p^2$, for some $[x]_p \in U(\mathbb{Z}_p)$. Using the fact that $[b]_p = [a]_p\cdot[x]_p^2$, we have

\[ [a\cdot b]_p = [a]_p\cdot[b]_p = [a]_p\cdot([a]_p\cdot[x]_p^2) = ([a]_p\cdot[x]_p)^2 = [a\cdot x]_p^2. \]

Therefore, in this case, $[a\cdot b]_p$ is indeed the square of an element of $U(\mathbb{Z}_p)$. Thus, it is always the case that at least one of $[a]_p$, $[b]_p$, $[a\cdot b]_p$ is the square of an element of $\mathbb{Z}_p$.
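
Lemma 9.14 can be checked by brute force for small primes. The Python sketch below is only a sanity check, not a proof; the function names and the list of primes are our own choices.

```python
def squares_mod_p(p):
    """Set of squares in Z_p, including [0]_p."""
    return {(c * c) % p for c in range(p)}


def lemma_9_14_holds(p):
    """Check that for all a, b in Z_p at least one of a, b, a*b is a square."""
    sq = squares_mod_p(p)
    return all(a in sq or b in sq or (a * b) % p in sq
               for a in range(p) for b in range(p))


print(sorted(squares_mod_p(7)))
print(all(lemma_9_14_holds(p) for p in [2, 3, 5, 7, 11, 13]))
```

For $p = 7$ the nonzero squares are $[1]_7, [2]_7, [4]_7$, which is $\frac{p-1}{2} = 3$ elements, in line with the counting argument in the proof.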


We can now use Lemmas 9.12, 9.13, and 9.14 to produce an infinite number of polynomials in $\mathbb{Z}[x]$ that are irreducible over $\mathbb{Q}$ but become reducible when we reduce down to $\mathbb{Z}_p$, for every prime $p$.


Theorem 9.15. If $A$ is an integer, then, for every prime number $p$, the polynomial
\[ x^4 - [A]_p x^2 + [1]_p \]
is reducible over $\mathbb{Z}_p$. However, the polynomial
\[ x^4 - Ax^2 + 1 \]
is irreducible over $\mathbb{Q}$, except when either $A + 2$ or $A - 2$ is a perfect square in $\mathbb{Z}$.


Proof. Lemma 9.12 told us that if $A$ is an integer, then $x^4 - Ax^2 + 1$ is irreducible over $\mathbb{Q}$, except when either $A + 2$ or $A - 2$ is a perfect square in $\mathbb{Z}$. Therefore, to prove our result, we only need to consider the situation over $\mathbb{Z}_p$. As is often the case when studying $\mathbb{Z}_p$, we will need to examine the $p = 2$ case separately.


In $\mathbb{Z}_2$, $[0]_2^2 = [0]_2$ and $[1]_2^2 = [1]_2$. Therefore, for any integer $A$, $[A]_2^2 = [A]_2$. Combining this with the fact that $[A]_2 = -[A]_2$, we have

\[ (x^2 + [A]_2 x + [1]_2)^2 = (x^2 + [A]_2 x + [1]_2)(x^2 + [A]_2 x + [1]_2) = x^4 + ([A]_2 + [A]_2)x^3 + ([A]_2^2 + [1]_2 + [1]_2)x^2 + ([A]_2 + [A]_2)x + [1]_2 = x^4 + [A]_2 x^2 + [1]_2 = x^4 - [A]_2 x^2 + [1]_2. \]

Thus, $x^4 - [A]_2 x^2 + [1]_2$ is reducible over $\mathbb{Z}_2$.


Next, if $A$ is an integer and $p$ is a prime number other than 2, let $a = [A]_p + [2]_p$ and $b = [A]_p - [2]_p$. Therefore,

\[ a\cdot b = ([A]_p + [2]_p)\cdot([A]_p - [2]_p) = [A]_p^2 - [4]_p. \]

Applying Lemma 9.14, it follows that at least one of $[A]_p + [2]_p$, $[A]_p - [2]_p$, $[A]_p^2 - [4]_p$ is the square of an element of $\mathbb{Z}_p$. Furthermore, since $p \neq 2$, 2 has a multiplicative inverse in $\mathbb{Z}_p$. As a result, we can apply Lemma 9.13 to conclude that $x^4 - [A]_p x^2 + [1]_p$ is reducible over $\mathbb{Z}_p$.
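
To see Theorem 9.15 in action, one can search directly for a factorization into two monic quadratics over $\mathbb{Z}_p$. The sketch below is a brute-force illustration under our own assumptions (hypothetical function name, small primes only); it is not the constructive method of Lemma 9.13.

```python
def quadratic_factorization_mod_p(A, p):
    """Search for a, b, c, d in Z_p with
    (x^2 + a x + b)(x^2 + c x + d) = x^4 - A x^2 + 1 in Z_p[x].
    Returns (a, b, c, d), or None if the search fails."""
    target = (0, (-A) % p, 0, 1 % p)   # coefficients of x^3, x^2, x, 1
    for a in range(p):
        for b in range(p):
            for c in range(p):
                for d in range(p):
                    product = ((a + c) % p,
                               (a * c + b + d) % p,
                               (a * d + b * c) % p,
                               (b * d) % p)
                    if product == target:
                        return (a, b, c, d)
    return None


# x^4 + 1 (A = 0) is irreducible over Q, yet a factorization exists
# modulo each of the primes tried here.
for p in [2, 3, 5, 7, 11, 13]:
    print(p, quadratic_factorization_mod_p(0, p))
```

The proofs of Lemma 9.13 and of the $p = 2$ case show that such a pair of quadratics always exists, so the search should never return `None` for a prime $p$.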


■ Examples


As we indicated in the examples that followed Lemma 9.12, if $A$ is an integer that is less than $-2$, then $x^4 - Ax^2 + 1$ is irreducible over $\mathbb{Q}$. Thus, all polynomials of the form

\[ x^4 + 3x^2 + 1,\quad x^4 + 4x^2 + 1,\quad x^4 + 5x^2 + 1,\quad x^4 + 6x^2 + 1,\quad x^4 + 7x^2 + 1,\quad x^4 + 8x^2 + 1,\ \ldots \]

have the property that they are irreducible over $\mathbb{Q}$, but Theorem 9.11 does not apply, as they become reducible whenever we look at them over $\mathbb{Z}_p$, for all primes $p$.


We conclude this chapter with an example of a polynomial that appears to be beyond the tools and techniques we have developed so far. However, we will show how to adapt our tools to handle this and some other polynomials. To this end, suppose $f(x) \in \mathbb{Q}[x]$ is reducible over $\mathbb{Q}$. Therefore, there exist $g(x), h(x) \in \mathbb{Q}[x]$ such that

\[ f(x) = g(x)\cdot h(x), \quad\text{where } \deg(g(x)), \deg(h(x)) < \deg(f(x)). \]

Now, replace $x$ by $t + 1$ and let $F(t) = f(t+1)$, $G(t) = g(t+1)$, and $H(t) = h(t+1)$. The fact that $f(x) = g(x)\cdot h(x)$ now tells us that $F(t) = G(t)\cdot H(t)$. Since $G(t)$ and $g(x)$ have the same degree, as do $H(t)$ and $h(x)$, we see that $F(t)$ is also reducible over $\mathbb{Q}$. Looking at this from another perspective, it tells us that if $F(t)$ is irreducible over $\mathbb{Q}$, then so is $f(x)$. We now apply this observation.


■ Examples


Let $f(x) = x^4 + x^3 + x^2 + x + 1$. At first glance, it is unclear as to whether $f(x)$ is reducible over $\mathbb{Q}$. However, if we let $x = t + 1$, we have

\[ f(x) = x^4 + x^3 + x^2 + x + 1 = \frac{x^5 - 1}{x - 1} = \frac{(t+1)^5 - 1}{t} = \frac{t^5 + 5t^4 + 10t^3 + 10t^2 + 5t}{t} = t^4 + 5t^3 + 10t^2 + 10t + 5 = F(t). \]

We can use Eisenstein’s Criterion with the prime $p = 5$ to conclude that $F(t)$ is irreducible over $\mathbb{Q}$. Thus $f(x)$ is also irreducible over $\mathbb{Q}$. In the exercises, we will extend this argument to all polynomials of the form $x^{p-1} + x^{p-2} + \cdots + x + 1$, where $p$ is a prime.
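
The substitution $x = t + 1$ followed by Eisenstein’s Criterion can be sketched in a few lines of Python. The function names are hypothetical, and coefficients are assumed to be listed with the highest degree first.

```python
from math import comb


def shift_by_one(coeffs):
    """Coefficients (highest degree first) of F(t) = f(t + 1)."""
    n = len(coeffs) - 1
    shifted = [0] * (n + 1)                 # shifted[j] is the coefficient of t^(n-j)
    for i, c in enumerate(coeffs):          # c is the coefficient of x^(n-i)
        degree = n - i
        for k in range(degree + 1):         # (t+1)^degree = sum over k of C(degree, k) t^k
            shifted[n - k] += c * comb(degree, k)
    return shifted


def eisenstein_applies(coeffs, p):
    """True if p misses the leading coefficient, divides every other
    coefficient, and p^2 does not divide the constant term."""
    leading, *rest = coeffs
    return (leading % p != 0
            and all(c % p == 0 for c in rest)
            and rest[-1] % (p * p) != 0)


f = [1, 1, 1, 1, 1]                         # x^4 + x^3 + x^2 + x + 1
F = shift_by_one(f)
print(F, eisenstein_applies(F, 5))
```

Eisenstein’s Criterion does not apply to $f(x)$ itself, which is why the shift is needed before the test succeeds.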


Exercises for Section 9.4 


In exercises 1-27, use any of the techniques in this section to determine if the given 
polynomial is reducible in Q[x]. These exercises should indicate that two polynomials can 
appear to be very similar and have graphs that are virtually identical, but one can be reducible 
and the other irreducible. 


1. a? as? 3-46 


9, S44 He+5 
3. x4+4x27+%x4+6 
4, x444x? 4x47 
5. x4+4x°+2x+6 
6, 3x°=4a7 = 2144-28 
7. 39° 7x? =21 9428 
8. 3x*—7x3 — 127428 
9, 3x°=7x" — 14e+28 
10. 3x4—5x3-—9x4+15 
11, 3x*=65° 9x4 15 
12, Sx =—5x*—10e 415 
13, x 42427 —4 
14, 9° +39? 42%=3 
1S.. 27 45257 22 
16. x++2x3-2x—-4 
i a = 24 =5 
18. 2x4=3x7 + 10x 15 




19. 
20. 
21. 
22. 
23. 
24. 
25. 
26. 
27. 
28. 
29. 


30. 




2x* — 3x3 +9x — 15 

2x*t — 5x3 + 10x — 15 

2x* — 5x3 + 6x — 15 

2x4 — 6x3 + 6x —15 

2x4 — 5x3 +5x—-15 

8x° +. 4x° + 6x + 20 

8x? — 10x* — 16x+20 

8x3 — 6x? — 21x +20 

8x> — 5x” — 20x + 20 

How many monic irreducible polynomials of degree 1 are there in Z,[x]? 


How many monic irreducible polynomials of degree 2 are there in Z,[x]? It might be 
easier if you first count the number of monic reducible polynomials of degree 2. 


How many monic irreducible polynomials of degree 3 are there in Z,[x]? It might be 
easier if you first count the number of monic reducible polynomials of degree 3. 


For exercises 31-32, please read the following: 


A nonzero polynomial f(x) € Z[x] is called primitive if the greatest common divisor of the 
coefficients of f(x) is 1. 


31. Suppose $f(x), g(x), h(x) \in \mathbb{Z}[x]$ such that $f(x) = g(x)\cdot h(x)$. Show that $f(x)$ is primitive if and only if both $g(x)$ and $h(x)$ are primitive. For various primes $p$, you might want to think about the ring homomorphisms we have examined that send $\mathbb{Z}[x]$ to $\mathbb{Z}_p[x]$.


32. Suppose $f(x), g(x) \in \mathbb{Z}[x]$ and $h(x) \in \mathbb{Q}[x]$ such that $f(x) = g(x)\cdot h(x)$. Show that if $g(x)$ is primitive, then $h(x) \in \mathbb{Z}[x]$.


33. Let $p$ be a prime number and let $f(x) = x^{p-1} + x^{p-2} + \cdots + x^2 + x + 1$. In doing this exercise, you might want to refer to exercise 36 from Section 9.1.

(a) Show that $f(x) = \frac{x^p - 1}{x - 1}$.

(b) If we let $F(x) = f(x+1)$, show that $F(x) = x^{p-1} + \binom{p}{1}x^{p-2} + \binom{p}{2}x^{p-3} + \cdots + \binom{p}{p-3}x^2 + \binom{p}{p-2}x + p$.

(c) Prove that $F(x)$ is irreducible in $\mathbb{Q}[x]$.

(d) Conclude that $f(x)$ is irreducible in $\mathbb{Q}[x]$.




In exercises 34-42, express the given polynomial as a product of monic irreducible 
polynomials in Q[x], R[x], and C[x]. If necessary, elements of C can be expressed in polar 


form. 

34, x° ST, 
5 x 
36. x1. 
87 xls 
38, x= 1, 
39, x41. 
40, x®—-1, 
41, xl0_], 
42, x!2_-], 




Roots of Polynomials of Degree 
Less than 5 


In Chapter 1, we mentioned that there exist formulas for the roots of all polynomials of degree 
at most 4. These formulas involve combining the polynomial’s coefficients using only 
addition, subtraction, multiplication, division, and taking $n$th roots, for $n \ge 2$. In this rather
brief chapter, we will derive the formulas for the roots of all polynomials of degree at most 4. 
More precisely, we will provide a series of steps and procedures that will lead us to the roots of 
these polynomials by manipulating the coefficients using only the operations just listed. We 
will not explicitly write out the final versions of these formulas as that would take many pages. 
But more important than writing out extremely long formulas, we provide algorithms that 
prove that one can indeed find the roots of any polynomial of degree at most 4 by doing 
nothing more than combining the polynomial’s coefficients using addition, subtraction, 
multiplication, division, and taking $n$th roots, for $n \ge 2$.


Galois proved that analogous formulas cannot exist for polynomials of degree 5 and above. In 
Chapter 17, we will prove Galois’ famous result on the insolvability of the quintic. Although 
we do not yet have enough mathematical machinery to prove this result, we are in a position to 
apply his result. Therefore, we will conclude this chapter with examples of fifth-degree 
polynomials whose roots cannot be found by adding, subtracting, multiplying, dividing, and 
taking nth roots of combinations of the coefficients. 


10.1 Finding Roots of Polynomials of Small Degree 


The strategy for finding formulas for the roots of a polynomial of small degree will be to make 
a series of substitutions, each of which will simplify the polynomial into a new polynomial 
with simpler coefficients. In some ways, this will be analogous to integration problems in 
calculus where you made several substitutions before finally obtaining a problem that could be 
easily handled. Here is an outline of our strategy: 


Step I Divide the polynomial by its leading coefficient so that the new leading coefficient 
is equal to 1. 




Step II Make a substitution so that the new coefficient immediately following the leading 
coefficient is equal to 0. 


Step III Make a substitution so that the new trailing coefficient is equal to 1. 


For polynomials of degree 1, performing Step I will be enough for us to easily find the root. In 
the degree 2 case, we will be able to find the roots after performing Steps I and II. At that 
point, we will have succeeded in deriving familiar formulas that you have undoubtedly come 
across in your earlier algebra courses. Not surprisingly, the situation for polynomials of degree 
3 will be more difficult. Not only will we need to apply Steps I, II, and III, but we will also 
need to make a substitution that appears to come out of nowhere. The degree 4 case will 
involve even more computations, but in some ways, these computations will be more natural 
and motivated than those required to complete the degree 3 case. 


Roots of Polynomials of Degree 1

Given
\[ ax + b = 0 \]
with $a \neq 0$, apply Step I and divide both sides of the equation by $a$ to obtain
\[ x + \frac{b}{a} = 0. \]
Then subtract $\frac{b}{a}$ from both sides to obtain the root
\[ x = -\frac{b}{a}. \]

Roots of Polynomials of Degree 2

Given
\[ ax^2 + bx + c = 0 \]
with $a \neq 0$, apply Step I and divide both sides of the equation by $a$ to obtain
\[ x^2 + \frac{b}{a}x + \frac{c}{a} = 0. \]

Next, to apply Step II, make the substitution $x = y - \frac{b}{2a}$. Note that if we can find $y$, then we can certainly find $x$; thus it suffices to find $y$. Our previous equation now becomes
\[ \left(y - \frac{b}{2a}\right)^2 + \frac{b}{a}\left(y - \frac{b}{2a}\right) + \frac{c}{a} = 0. \]

Expanding out the terms in this equation, we obtain
\[ y^2 - \frac{b}{a}y + \frac{b^2}{4a^2} + \frac{b}{a}y - \frac{b^2}{2a^2} + \frac{c}{a} = 0, \]
which simplifies to
\[ y^2 + \frac{4ac - b^2}{4a^2} = 0. \]

We then subtract $\frac{4ac - b^2}{4a^2}$ from both sides to obtain
\[ y^2 = \frac{b^2 - 4ac}{4a^2}. \]

Taking square roots is an allowable operation, and when we take the square root of both sides we obtain
\[ y = \pm\frac{\sqrt{b^2 - 4ac}}{2a}. \]

Having successfully solved for $y$, we can now go back to the substitution $x = y - \frac{b}{2a}$ and see that
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]

This is, of course, the familiar quadratic formula.
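
The two steps used in the degree 2 case translate directly into code. The following Python sketch, with a hypothetical function name, follows Steps I and II rather than quoting the final formula; it uses `cmath` so that negative values of $b^2 - 4ac$ are handled as well.

```python
import cmath


def quadratic_roots(a, b, c):
    """Roots of a x^2 + b x + c (a != 0) via Steps I and II."""
    b1, c1 = b / a, c / a                   # Step I: x^2 + b1 x + c1 = 0
    shift = b1 / 2                          # Step II: x = y - b/(2a)
    y_squared = shift * shift - c1          # y^2 = (b^2 - 4ac) / (4a^2)
    y = cmath.sqrt(y_squared)
    return (y - shift, -y - shift)


print(quadratic_roots(1, -3, 2))
```

For $x^2 - 3x + 2$ the two values returned are the roots 2 and 1, represented as complex numbers with zero imaginary part.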


In your previous algebra courses, the techniques used to derive the quadratic formula might have been simpler than the procedure just outlined. The reason for this is that we are not necessarily looking for the simplest technique in the degree 2 case but are instead looking for a technique that can be easily generalized and extended to help us study polynomials of degree 3 and 4. The key step in the previous procedure was making the substitution $x = y - \frac{b}{2a}$. Often this step is referred to as “completing the square.” In order to find the roots of polynomials of degree 3 and 4, we will need to generalize this substitution to those situations. In other words, in the degree 3 case, we will need to find an $\alpha$ such that substituting $y - \alpha$ for $x$ will result in a polynomial in $y$ with no degree 2 term. Let us examine what value of $\alpha$ will do the trick. If we begin with
\[ ax^3 + bx^2 + cx + d \]
and let $x = y - \alpha$, then we obtain
\[ a(y - \alpha)^3 + b(y - \alpha)^2 + c(y - \alpha) + d = ay^3 + (-3a\alpha + b)y^2 + \text{terms of lower degree}. \]

Therefore, for the coefficient of $y^2$ to be 0, we need $-3a\alpha + b = 0$. As a result, $\alpha$ must be equal to $\frac{b}{3a}$. Thus, in order to apply Step II in the degree 3 case, we will make the substitution
\[ x = y - \frac{b}{3a}. \]


Roots of Polynomials of Degree 3

Given
\[ ax^3 + bx^2 + cx + d = 0 \]
with $a \neq 0$, apply Step I and divide both sides of the equation by $a$ to obtain
\[ x^3 + \frac{b}{a}x^2 + \frac{c}{a}x + \frac{d}{a} = 0. \]

To apply Step II, we now make the substitution $x = y - \frac{b}{3a}$. Once again, if we can find $y$, then we can certainly find $x$, so it suffices to find $y$. You should check that our previous equation now becomes
\[ y^3 + \frac{-b^2 + 3ac}{3a^2}\,y + \frac{2b^3 - 9abc + 27a^2 d}{27a^3} = 0. \]

To simplify matters, we will let $B = \frac{-b^2 + 3ac}{3a^2}$ and $C = \frac{2b^3 - 9abc + 27a^2 d}{27a^3}$. Our previous equation now simplifies to
\[ y^3 + By + C = 0. \]

If $C = 0$, then we can factor and obtain
\[ 0 = y^3 + By = y(y^2 + B). \]

Therefore, one of the roots is 0 and the others can be found by applying the quadratic formula to $y^2 + B = 0$. As a result, we will assume that $C \neq 0$ and, to apply Step III, we will make the substitution $y = C^{1/3}z$. Note that if we can solve for $z$, then we will know the value of $y$, so it suffices to solve for $z$. Our substitution turns the equation $y^3 + By + C = 0$ into
\[ 0 = (C^{1/3}z)^3 + B(C^{1/3}z) + C = Cz^3 + BC^{1/3}z + C. \]

Since $C \neq 0$, we can divide this equation by $C$ to obtain
\[ z^3 + \frac{B}{C^{2/3}}\,z + 1 = 0. \]

We can now further simplify this equation by letting $D = \frac{B}{C^{2/3}}$, thereby giving us the equation
\[ z^3 + Dz + 1 = 0. \]


This is as far as Steps I, II, and III can take us. Therefore, in order to finish this problem, we will need another substitution or idea. There is a substitution that works at this point, but, unfortunately, it is rather unmotivated and appears to come out of nowhere. However, it does get the job done, and that is the most important thing. We let $z = v - \frac{D}{3v}$ and once again note that if we can find $v$, then we will know the value of $z$. Thus, it suffices to find $v$. Our previous equation now becomes
\[ 0 = \left(v - \frac{D}{3v}\right)^3 + D\left(v - \frac{D}{3v}\right) + 1 = \left(v^3 - Dv + \frac{D^2}{3v} - \frac{D^3}{27v^3}\right) + \left(Dv - \frac{D^2}{3v}\right) + 1, \]
which simplifies to
\[ v^3 - \frac{D^3}{27v^3} + 1 = 0. \]

Multiplying this equation by $v^3$ and then letting $E = -\frac{D^3}{27}$ gives us
\[ v^6 + v^3 + E = 0. \]

Finally, let $v = w^{1/3}$; if we can solve for $w$, then we can certainly find $v$. With this substitution, the previous equation now becomes
\[ (w^{1/3})^6 + (w^{1/3})^3 + E = 0, \]
which immediately simplifies to
\[ w^2 + w + E = 0. \]

At various points in this procedure, we have taken cube roots, which is an allowable operation. We can now use the quadratic formula to find $w$. Knowing $w$, we can then, in order, find $v$, $z$, $y$, and then $x$. The actual expression for $x$ in terms of $a, b, c, d$ is quite long and complicated. However, at this point, we are much more interested in the fact that such an expression actually exists. Thus, we have succeeded in showing that there is indeed a formula for the roots of $ax^3 + bx^2 + cx + d$, which only involves combining $a, b, c, d$ using addition, subtraction, multiplication, division, taking square roots, and taking cube roots.
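
The chain of substitutions for the cubic can be traced numerically. The following Python sketch is one possible reading of the procedure, with assumptions of our own: the function names are hypothetical, complex floating-point arithmetic replaces exact radicals, a fixed cube root of $C$ is used for $C^{1/3}$, and all three cube roots of a single nonzero $w$ are used to produce the three roots.

```python
import cmath


def cube_roots(w):
    """All three complex cube roots of a nonzero number w."""
    r, phi = cmath.polar(w)
    return [cmath.rect(r ** (1 / 3), (phi + 2 * cmath.pi * k) / 3) for k in range(3)]


def cubic_roots(a, b, c, d):
    """Roots of a x^3 + b x^2 + c x + d (a != 0), following the substitutions in the text."""
    shift = b / (3 * a)                                   # Step II: x = y - b/(3a)
    B = (3 * a * c - b * b) / (3 * a * a)
    C = (2 * b ** 3 - 9 * a * b * c + 27 * a * a * d) / (27 * a ** 3)
    if C == 0:                                            # y (y^2 + B) = 0
        s = cmath.sqrt(-B)
        return [-shift, s - shift, -s - shift]
    c13 = cube_roots(C)[0]                                # a fixed cube root of C
    D = B / c13 ** 2                                      # Step III: y = C^(1/3) z
    E = -D ** 3 / 27
    disc = cmath.sqrt(1 - 4 * E)                          # solve w^2 + w + E = 0
    w = max((-1 + disc) / 2, (-1 - disc) / 2, key=abs)    # pick a nonzero root
    roots = []
    for v in cube_roots(w):                               # v^3 = w
        z = v - D / (3 * v)
        roots.append(c13 * z - shift)                     # undo y = C^(1/3) z and x = y - b/(3a)
    return roots


print(cubic_roots(1, 0, -7, 6))
```

For $x^3 - 7x + 6 = (x-1)(x-2)(x+3)$, the printed values should be close to 1, 2, and $-3$ in some order, up to rounding error. The exact comparison `C == 0` is adequate for small integer inputs; a tolerance would be more robust for general floating-point coefficients.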


Roots of Polynomials of Degree 4

Given
\[ ax^4 + bx^3 + cx^2 + dx + e = 0 \]
with $a \neq 0$, apply Step I and divide the equation by $a$ to obtain
\[ x^4 + \frac{b}{a}x^3 + \frac{c}{a}x^2 + \frac{d}{a}x + \frac{e}{a} = 0. \]

In light of our discussion before the degree 3 case, it should now come as no surprise that to apply Step II, we make the substitution $x = y - \frac{b}{4a}$. As before, in order to find $x$, it suffices to find $y$. Our equation now becomes
\[ \left(y - \frac{b}{4a}\right)^4 + \frac{b}{a}\left(y - \frac{b}{4a}\right)^3 + \frac{c}{a}\left(y - \frac{b}{4a}\right)^2 + \frac{d}{a}\left(y - \frac{b}{4a}\right) + \frac{e}{a} = 0. \]

If we expand out the terms in the previous equation, we obtain
\[ y^4 + \left(-4\cdot\frac{b}{4a} + \frac{b}{a}\right)y^3 + \text{terms of lower degree} = 0. \]
Therefore, this equation can be written as
\[ y^4 + By^2 + Cy + D = 0, \]
where $B, C, D$ are obtained from $a, b, c, d, e$ using only addition, subtraction, multiplication, and division. If $D = 0$, then we can factor and obtain
\[ y(y^3 + By + C) = 0. \]

Thus, $y = 0$ is one of the roots, and the other three can be found by applying our work on cubic polynomials to $y^3 + By + C$. As a result, we may assume that $D \neq 0$ and can apply Step III, by letting $y = D^{1/4}z$, to obtain
\[ 0 = (D^{1/4}z)^4 + B(D^{1/4}z)^2 + C(D^{1/4}z) + D = Dz^4 + BD^{1/2}z^2 + CD^{1/4}z + D. \]

Dividing this equation by $D$ gives us
\[ z^4 + \frac{B}{D^{1/2}}z^2 + \frac{C}{D^{3/4}}z + 1 = 0. \]

We can simplify further by letting $E = \frac{B}{D^{1/2}}$ and $F = \frac{C}{D^{3/4}}$ to obtain
\[ z^4 + Ez^2 + Fz + 1 = 0. \]
If $F = 0$, then the substitution $z = t^{1/2}$ turns our equation into
\[ t^2 + Et + 1 = 0. \]

Since the roots of this polynomial can be found using the quadratic formula, we would be done in this case. Therefore, we may now assume that $F \neq 0$.


This is as far as we can go using only Steps I, II, and III. However, the idea we will use to 
finish this problem is much more natural than the one used at the comparable point in 


Roots of Polynomials of Degree Less than5 417 


our study of the degree 3 case. The idea is to try to factor z+ + Ez? + Fz +1 into two quadratic 
polynomials as 


CHEC +Fet1 = (2 +az+ Bye? +yz4+6), 


where a, 6, y, 6 can be obtained from EF and F using the usual operations of addition, 
subtraction, multiplication, division, and taking nth roots, for various n > 2. If we are 
successful in factoring z+ + Ez* + Fz +1, then we can easily find all of its roots by using the 
quadratic formula to find the roots of both z7 ++az+ B and z74+ yz+6. 


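Observe that

\[
(z^2 + \alpha z + \beta)(z^2 + \gamma z + \delta) = z^4 + (\alpha + \gamma)z^3 + (\beta + \delta + \alpha\gamma)z^2 + (\alpha\delta + \beta\gamma)z + \beta\delta.
\]

By comparing the coefficients of like terms, the above product will be equal to
$z^4 + Ez^2 + Fz + 1$ precisely if we can find $\alpha, \beta, \gamma, \delta$ such that

\[
(1) \quad \alpha + \gamma = 0, \quad \beta + \delta + \alpha\gamma = E, \quad \alpha\delta + \beta\gamma = F, \quad \text{and} \quad \beta\delta = 1.
\]

It is clear that the first and fourth equations from (1) tell us that

\[
\gamma = -\alpha \quad \text{and} \quad \delta = \beta^{-1}.
\]

As a result, the second and third equations from (1) can now be simplified to

\[
\beta + \beta^{-1} - \alpha^2 = E \quad \text{and} \quad \alpha(\beta^{-1} - \beta) = F.
\]

Since we are in the case where $F \neq 0$, it now follows that $\alpha \neq 0$. Therefore, we are allowed to
divide the second of the preceding two equations by $\alpha$, and it is easy to see that it now suffices
to find $\alpha, \beta$ such that

\[
(2) \quad \beta + \beta^{-1} = E + \alpha^2 \quad \text{and} \quad \beta^{-1} - \beta = \frac{F}{\alpha}.
\]

If we add the second equation from (2) to the first and then also subtract the second equation
from the first, we obtain

\[
(3) \quad 2\beta^{-1} = (E + \alpha^2) + \frac{F}{\alpha} \quad \text{and} \quad 2\beta = (E + \alpha^2) - \frac{F}{\alpha}.
\]

If we multiply the two equations in (3), we see that

\[
4 = (E + \alpha^2)^2 - \left(\frac{F}{\alpha}\right)^2.
\]

Therefore, it now suffices to find an $\alpha$ satisfying this equation, since plugging that value of $\alpha$
into the second equation in (3) will immediately tell us the value of $\beta$.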




You should check that if we expand the terms from our previous equation, multiply both sides
by $\alpha^2$, and then collect like terms, we obtain

\[
\alpha^6 + 2E\alpha^4 + (E^2 - 4)\alpha^2 - F^2 = 0.
\]

Finally, if we make the substitution $\alpha = \theta^{1/2}$, then the previous equation becomes

\[
(\theta^{1/2})^6 + 2E(\theta^{1/2})^4 + (E^2 - 4)(\theta^{1/2})^2 - F^2 = 0.
\]

But this immediately simplifies to

\[
\theta^3 + 2E\theta^2 + (E^2 - 4)\theta - F^2 = 0.
\]


However, this is now a cubic polynomial in the variable $\theta$. Thus, using our work on cubic
polynomials, we can find $\theta$. Working backward, knowing $\theta$, we can then find, in order, $\alpha$, $\beta$, $\gamma$,
$\delta$, $z$, $y$, and then $x$. At every point along the way, the only operations we have used are
addition, subtraction, multiplication, division, and taking $n$th roots for various $n \ge 2$. Thus, the
roots of $ax^4 + bx^3 + cx^2 + dx + e$ are indeed obtained from $a, b, c, d, e$ using only addition,
subtraction, multiplication, division, and taking $n$th roots, for various $n \ge 2$. Admittedly, if we
wrote out the final formula for the roots of $ax^4 + bx^3 + cx^2 + dx + e$, it would be incredibly
long and complicated. In fact, it would probably take several pages. But the more important
point is that we have shown, in a fairly straightforward way, that it is indeed possible to
find the roots of all polynomials of degree at most 4 by combining the coefficients using only
addition, subtraction, multiplication, division, and taking $n$th roots, for various $n \ge 2$.
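The last step of the argument, recovering the two quadratic factors from a root of the cubic in $\theta$, can be checked on a small example. The following Python sketch is only an illustration under stated assumptions: the values $E = 3/2$, $F = -3/2$, and $\alpha = 1$ are chosen by hand so that $\theta = \alpha^2$ is a root of the cubic, and the helper names `poly_mul`, `resolvent`, and `quadratic_factors` are hypothetical, not taken from the text.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def resolvent(E, F, theta):
    """Value of theta^3 + 2E*theta^2 + (E^2 - 4)*theta - F^2."""
    return theta**3 + 2 * E * theta**2 + (E**2 - 4) * theta - F**2

def quadratic_factors(E, F, alpha):
    """Given alpha whose square is a root of the cubic, build both quadratic factors."""
    beta = ((E + alpha**2) - F / alpha) / 2   # second equation in (3)
    gamma = -alpha                            # first equation in (1)
    delta = 1 / beta                          # fourth equation in (1)
    return [Fraction(1), alpha, beta], [Fraction(1), gamma, delta]

# Hand-picked example (an assumption for illustration): z^4 + (3/2)z^2 - (3/2)z + 1.
E, F = Fraction(3, 2), Fraction(-3, 2)
alpha = Fraction(1)
q1, q2 = quadratic_factors(E, F, alpha)
product = poly_mul(q1, q2)
```

Here `product` holds the coefficients of $(z^2 + z + 2)(z^2 - z + \frac{1}{2})$, which should match $z^4 + Ez^2 + Fz + 1$. Exact rational arithmetic is used so that the comparison is not affected by rounding.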


10.2 A Brief Look at Some Consequences of Galois’ Work


If we wanted to try to find a formula for the roots of polynomials of degree 5, it would be
logical to once again begin by applying Steps I, II, and III.

Given

\[
ax^5 + bx^4 + cx^3 + dx^2 + ex + f = 0
\]

with $a \neq 0$, divide the equation by $a$ to obtain

\[
x^5 + \frac{b}{a}x^4 + \frac{c}{a}x^3 + \frac{d}{a}x^2 + \frac{e}{a}x + \frac{f}{a} = 0.
\]

Next, make the substitution $x = y - \frac{b}{5a}$. After simplifying we obtain

\[
y^5 + By^3 + Cy^2 + Dy + E = 0,
\]




where $B, C, D, E$ are obtained from $a, b, c, d, e, f$ using only addition, subtraction,
multiplication, and division. If $E = 0$, then we can factor and obtain

\[
y(y^4 + By^2 + Cy + D) = 0.
\]

Thus, $y = 0$ is one of the roots, and the other four can be found by applying our work on
quartic polynomials to $y^4 + By^2 + Cy + D$. As a result, we may assume $E \neq 0$, and if we let
$y = E^{1/5}z$ and then simplify, we obtain

\[
z^5 + Fz^3 + Gz^2 + Hz + 1 = 0,
\]

where $F, G, H$ are obtained from $a, b, c, d, e, f$ using only addition, subtraction,
multiplication, division, and taking fifth roots.

Unfortunately, this is as far as Steps I, II, and III can take us. Needless to say, many people worked
long and hard for many years trying to factor or simplify the polynomial $z^5 + Fz^3 + Gz^2 +
Hz + 1$ in an attempt to find its roots. But Galois showed that there is no formula for the roots
of polynomials of degree 5 that combines the coefficients using only addition, subtraction,
multiplication, division, and taking $n$th roots, for various $n$.


Certainly, there are many polynomials of degree 5 and above whose roots can be found by
combining the coefficients using only addition, subtraction, multiplication, division, and
taking $n$th roots. For example, using our study of the complex numbers from Chapter 6, we
know how to find the five roots of the polynomial $x^5 - 1$. We can also find the roots of the
polynomial $x^5 + 5x^3 + 5x^2 + 6x + 1$, since it factors as

\[
x^5 + 5x^3 + 5x^2 + 6x + 1 = (x^2 + x + 1)(x^3 - x^2 + 5x + 1).
\]

More generally, you can check that we can find the roots of any polynomial of the form
$x^5 + Bx^3 + Bx^2 + (B+1)x + 1$, since it factors as

\[
x^5 + Bx^3 + Bx^2 + (B+1)x + 1 = (x^2 + x + 1)(x^3 - x^2 + Bx + 1).
\]
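The general factorization can be confirmed by multiplying out the two factors. The short Python sketch below does this for a range of integer values of $B$; the helper names `poly_mul`, `quintic`, and `factors` are hypothetical and chosen only for this illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def quintic(B):
    """Coefficients of x^5 + B x^3 + B x^2 + (B + 1) x + 1."""
    return [1, 0, B, B, B + 1, 1]

def factors(B):
    """The quadratic x^2 + x + 1 and the cubic x^3 - x^2 + B x + 1."""
    return [1, 1, 1], [1, -1, B, 1]

# Compare the product of the factors with the quintic for several integer B.
all_match = all(poly_mul(*factors(B)) == quintic(B) for B in range(-10, 11))
```

The case $B = 5$ is the specific polynomial $x^5 + 5x^3 + 5x^2 + 6x + 1$ discussed above.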


These examples in no way contradict Galois’ work. Indeed, there are many polynomials of
degree 5 whose roots can be found by combining the coefficients using the operations
listed above. However, Galois’ work shows that there exist some polynomials of degree
5 whose roots cannot be found by combining the coefficients in these ways.
More precisely, Galois proved that if a polynomial satisfies certain properties, then there does
not exist a formula for its roots that combines its coefficients using only addition, subtraction,
multiplication, division, and taking $n$th roots, for various $n$.


At this point, it is worth looking at an important special case of Galois’s work, which we will 
prove in Chapter 17. 


Theorem 10.1 (A Special Case of Galois’ Work). Let $f(x) \in \mathbb{Z}[x]$ such that

(i) $f(x)$ has degree 5,
(ii) $f(x)$ is irreducible in $\mathbb{Q}[x]$, and
(iii) $f(x)$ has exactly three real roots.

Then there does not exist a formula for the roots of $f(x)$ that only involves combinations of the
coefficients using addition, subtraction, multiplication, division, and taking $n$th roots, for
various $n \ge 2$.


By using Eisenstein’s Criterion, the Intermediate Value Theorem, and Rolle’s Theorem, it
will be easy to find many degree 5 polynomials with these properties. Recall that the
Intermediate Value Theorem implies that if the values of a polynomial change sign three times, then it has
at least three real roots. On the other hand, Rolle’s Theorem implies that if the derivative of a
polynomial is equal to zero at only two points, then the original polynomial cannot have more
than three real roots.


Let us now consider the polynomials 


\[
x^5 - 6x + 3, \qquad x^5 - 4x + 2, \qquad x^5 - 8x + 6.
\]


We will show that all three satisfy the properties that allow us to apply Galois’ work. First,
observe that Eisenstein’s Criterion implies that all three polynomials are irreducible in $\mathbb{Q}[x]$.
Next, if you look at the values of all three polynomials at $x = -2, -1, 0, 1, 2$, then you can
see that they all change sign three times, so the Intermediate Value Theorem tells us that they
must all have at least three real roots. On the other hand, the derivatives of these three
polynomials are all of the form $5x^4 - \alpha$, where $\alpha$ is a positive real number. It is easy to see that
these derivatives are equal to 0 exactly twice, once when $x$ is positive and once when $x$ is
negative. Therefore, Rolle’s Theorem tells us that each of the three polynomials has at most
three real roots. Combining all of the preceding observations, we can see that Galois’ work can
indeed be applied to all three of these polynomials. Therefore, for each of these three
polynomials, it is impossible to find its roots by combining the coefficients using only addition,
subtraction, multiplication, division, and taking $n$th roots, for various $n$. Hopefully, you can
see that it is not difficult to find many degree 5 polynomials to which Galois’ work can be
applied.
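The three observations above are mechanical enough to check by machine. The following Python sketch does so for polynomials of the form $x^5 - cx + d$; it is an illustration only, the function names are hypothetical, and the prime used for Eisenstein's Criterion is supplied by hand for each example (the code does not verify that it is prime).

```python
def evaluate(coeffs, x):
    """Horner evaluation; coefficients are listed from highest degree to lowest."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def eisenstein_applies(coeffs, p):
    """True if p divides every coefficient except the leading one,
    and p*p does not divide the constant term (p is assumed to be prime)."""
    leading, rest = coeffs[0], coeffs[1:]
    return (leading % p != 0
            and all(c % p == 0 for c in rest)
            and coeffs[-1] % (p * p) != 0)

def sign_changes(coeffs, points):
    """Count sign changes among the nonzero values at the given points, in order."""
    values = [evaluate(coeffs, x) for x in points]
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def check(c, d, p):
    """Check the three observations for x^5 - c*x + d using the prime p."""
    coeffs = [1, 0, 0, 0, -c, d]
    irreducible = eisenstein_applies(coeffs, p)
    at_least_three = sign_changes(coeffs, [-2, -1, 0, 1, 2]) >= 3
    at_most_three = c > 0   # then 5x^4 - c vanishes only at x = +/-(c/5)^(1/4)
    return irreducible and at_least_three and at_most_three

# The three polynomials from the text, with primes 3, 2, and 2 respectively.
results = [check(6, 3, 3), check(4, 2, 2), check(8, 6, 2)]
```

The sign-change count uses only the five sample points named in the text, so a polynomial that passes has at least three real roots, while a polynomial that fails might still have three real roots detectable at other points.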


Exercises for Sections 10.1 and 10.2 


In exercises 1-14, find all the roots in $\mathbb{C}$ of the given polynomial. In each case, you may first
want to apply Steps I and II from this chapter to simplify the polynomial. You may leave your
answers in the form $a\,\mathrm{cis}(\theta) + b$, where $a, b, \theta \in \mathbb{R}$, $a > 0$, and $0 \le \theta < 2\pi$.

1. $x^3 + 6x^2 + 12x + 1$
2. $x^3 - 9x^2 + 27x - 17$
3. $8x^3 - 12x^2 + 6x - 201$
4. $125x^3 + 225x^2 + 135x + 12$
5. $x^4 + 4x^3 + 6x^2 + 4x - 5$
6. $x^4 - 20x^3 + 150x^2 - 500x + 614$
7. $16x^4 - 96x^3 + 216x^2 - 216x + 60$
8. $81x^4 + 108x^3 + 54x^2 + 12x - 49$
9. $x^5 - 5x^4 + 10x^3 - 10x^2 + 5x + 16$
10. $x^5 + 10x^4 + 40x^3 + 80x^2 + 80x - 1$
11. $243x^5 - 405x^4 + 270x^3 - 90x^2 + 15x - 13$
12. $32x^5 + 240x^4 + 720x^3 + 1080x^2 + 810x + 143$
13. $x^6 + 6x^5 + 15x^4 + 20x^3 + 15x^2 + 6x - 1950$
14. $64x^6 - 192x^5 + 240x^4 - 160x^3 + 60x^2 - 12x - 1954$


In exercises 15-19, show that the polynomial satisfies the properties required to apply
Theorem 10.1. As a result, we can then conclude that there is no formula for the roots of the
polynomial that combines the coefficients using only addition, subtraction, multiplication,
division, and taking $n$th roots, for various $n \ge 2$.

15. $3x^5 - 5x^3 - 30x + 5$
16. $3x^5 - 5x^3 - 90x - 15$
17. $x^5 - 10x^3 - 80x + 6$
18. $x^5 - 85x + 34$
19. $x^5 - 14x + 7$
20. In this exercise, we will generalize exercise 19.

    (a) Show that for any prime $p$, the polynomial $x^5 - 2px + p$ satisfies the properties
        required to apply Theorem 10.1.

    (b) Conclude that there are an infinite number of polynomials of degree 5 for which
        there is no formula for the roots of the polynomial that combines the coefficients
        using only addition, subtraction, multiplication, division, and taking $n$th roots, for
        various $n \ge 2$.



In exercises 21-26, factor the polynomial into a product of two quadratic polynomials in R[x]. 
Then use this factorization to find all four roots in C of the original polynomial. 


21. $x^4 + 10x^3 + 23x^2 + 10x + 1$
D2) G30 1 $3 

23. $x^4 + 12x^3 - 6x^2 - 84x - 7$
24. $x^4 + 16x^3 + 64x^2 - 1$

25, if 29x" = 36 

26, a Poe 160 47 


In exercises 27-32, factor the polynomial into a product of a cubic and a quadratic polynomial 
in R[x]. 


27. $x^5 - 18x^3 - 44x^2 - 13x - 1$
28. xP tx?+x741 

29. $x^5 - x^4 + x^3 + 2x^2 - 2x + 2$
30; PSO a= 1 SS 
31s PHS Hse ee 
32. © =41 457° +43 


Rational Values of Trigonometric Functions 


In this chapter, we apply the Rational Root Test along with Mathematical Induction to prove 
some surprising results about the values of trigonometric functions. Virtually every 
trigonometry student has used the Pythagorean Theorem to examine 30°-60°-90° and
45°-45°-90° triangles. In fact, most students are expected to derive, or at least memorize, the
following table: 


$\theta$          0°     30°            45°            60°            90°
$\sin(\theta)$    0      $1/2$          $\sqrt{2}/2$   $\sqrt{3}/2$   1
$\cos(\theta)$    1      $\sqrt{3}/2$   $\sqrt{2}/2$   $1/2$          0
$\tan(\theta)$    0      $\sqrt{3}/3$   1              $\sqrt{3}$     undefined


Observe that with the obvious exception of tan(90°), every entry in the table is either a
rational number or the square root of a rational number. Using formulas like the double-angle
and triple-angle formulas, it is not hard to compute the values of the trigonometric functions at
various other angles in the first quadrant. For example, $\cos(36°) = \frac{1 + \sqrt{5}}{4}$ and cos(20°) is the
unique positive root of the polynomial $8x^3 - 6x - 1$. Note that neither cos(36°) nor cos(20°) is
rational. Furthermore, neither cos(36°) nor cos(20°) is the square root of a rational number.


At this point, it is natural to wonder which values of the sine, cosine, and tangent functions are 
rational. Or which values, at the very least, are the square root of a rational number. Since the 
sine and cosine functions are continuous everywhere, the Intermediate Value Theorem tells us 
that they both assume all values in the interval [—1, 1]. For example, if 5, aES and = are your 
favorite numbers in the interval [0, 1], then there exist angles $\theta_1, \theta_2, \theta_3$ in the first quadrant
such that 


ne 2 tee a af 5s 
cos(6;) = sin(90° — 6;) = 3° cos(@2) = sin(90° — 63) = eh 
and cos(63) = sin(90° — 63) = = 




Similarly, when plugging in all the angles between −90° and 90°, the tangent function takes
on all real values. Thus, if $3$, $\sqrt{11}$, and $e + \pi$ are your favorite positive numbers, then there
exist angles $\theta_4, \theta_5, \theta_6$ in the first quadrant such that

\[
\tan(\theta_4) = 3, \quad \tan(\theta_5) = \sqrt{11}, \quad \text{and} \quad \tan(\theta_6) = e + \pi.
\]


In light of the preceding observations, it might seem curious that in trigonometry courses we 
don’t use techniques like the Pythagorean Theorem along with the formulas for $\cos(n\theta)$, where
$n \in \mathbb{N}$, to find the angle in the first quadrant such that the cosine is equal to . Similarly, why


don’t we find the angles in the first quadrant such that the sine is equal to - or the tangent is 
equal to 3? 


As we shall soon see, in all three of these cases, the degree measure of the appropriate angle is 
not a rational number. Observe that in a first course in trigonometry, the only angles for which 
we have the tools required to compute exact values of trigonometric functions are angles 
where the degree measure is a rational number. Therefore, there would be no way in a basic 
trigonometry course for us to compute the degree measure of the angles that makes the cosine 
equal to %, the sine equal to 2, or the tangent equal to 3. 


11.1 Values of Trigonometric Functions 


The main results of this chapter will assert that the table we saw earlier in this chapter provides 
us with essentially the only cases where the degree measure of an angle is rational and the 
value of a trigonometric function is rational or the square root of a rational number. This 
explains why the 30°-60°-90° and 45°-45°-90° triangles tend to be the only right triangles
studied in trigonometry courses. 


We now make the standard transition to radian measure. Recall that the radian measures of the
five angles in the table presented earlier are $0, \frac{\pi}{6}, \frac{\pi}{4}, \frac{\pi}{3}, \frac{\pi}{2}$. In fact, it is easy to see that the
degree measure of an angle is a rational number if and only if its radian measure is of the form
$\alpha\pi$, where $\alpha \in \mathbb{Q}$. Using radian measure, we can now state the main result of this chapter.


Theorem 11.1. If $\alpha \in \mathbb{Q}$ such that $\cos^2(\alpha\pi) \in \mathbb{Q}$, then $\cos(\alpha\pi)$ belongs to the set
$\left\{0, \pm\frac{1}{2}, \pm\frac{\sqrt{2}}{2}, \pm\frac{\sqrt{3}}{2}, \pm 1\right\}$.


Before proving Theorem 11.1, let us make several observations. First, Theorem 11.1 asserts 
that angles in the four quadrants with reference angles of 30°, 45°, and 60° along with angles 
that are integer multiples of 90° are the only angles whose degree measure is a rational number 
and the value of the cosine is either rational or the square root of a rational number. Next, one 
would expect a result analogous to Theorem 11.1 to also hold for the sine. This is indeed the 
case. But there is a surprising lack of symmetry. Whereas we will first prove Theorem 11.1
and then use it to obtain a result about the sine, it does not appear that we can reverse the order
and first prove a result about the sine. However, once we do prove Theorem 11.1, we will
immediately verify


Corollary 11.2. If $\alpha \in \mathbb{Q}$ such that $\sin^2(\alpha\pi) \in \mathbb{Q}$, then $\sin(\alpha\pi)$ belongs to the set
$\left\{0, \pm\frac{1}{2}, \pm\frac{\sqrt{2}}{2}, \pm\frac{\sqrt{3}}{2}, \pm 1\right\}$.


After proving Corollary 11.2, we will apply Theorem 11.1 to the tangent function to obtain 


Corollary 11.3. If $\alpha \in \mathbb{Q}$ such that $\tan^2(\alpha\pi) \in \mathbb{Q}$, then $\tan(\alpha\pi)$ belongs to the set
$\left\{0, \pm\frac{\sqrt{3}}{3}, \pm 1, \pm\sqrt{3}\right\}$.


At first glance, it is unclear how the Rational Root Test can be used to obtain information 
about values of trigonometric functions. Therefore, before presenting all the details behind the 
proof of Theorem 11.1, we proceed with 


Intuition behind the proof of Theorem 11.1. To gain some insight into this problem, let us
suppose that both $\alpha$ and $\cos(\alpha\pi)$ are rational numbers. We want to understand how the
Rational Root Test helps us in determining the possible values of $\cos(\alpha\pi)$. Since $\alpha \in \mathbb{Q}$, when
we consider the sequence

\[
\alpha\pi, \; 2\alpha\pi, \; 4\alpha\pi, \; 8\alpha\pi, \; 16\alpha\pi, \; \ldots
\]

there will be terms that differ by an integer multiple of $2\pi$. This tells us that there exist positive
integers $n > m$ such that $\cos(2^n \alpha\pi) = \cos(2^m \alpha\pi)$.


Next, recall that the double-angle formula for the cosine asserts that

\[
\cos(2\theta) = 2\cos^2(\theta) - 1.
\]

Therefore, if we let $f(x) = 2x^2 - 1$ and also let $f^n(x)$ denote the composition of $n$ copies of the
function $f(x)$, then we have

\[
\cos(2\theta) = f(\cos(\theta)),
\]
\[
\cos(4\theta) = f(\cos(2\theta)) = f^2(\cos(\theta)),
\]

and, more generally,

\[
\cos(2^t\theta) = f^t(\cos(\theta)),
\]

for all $t \in \mathbb{N}$.

Therefore, if we let $c = \cos(2^m \alpha\pi)$ and also let $t = n - m$, then we have

\[
c = \cos(2^m \alpha\pi) = \cos(2^n \alpha\pi) = f^t(\cos(2^m \alpha\pi)) = f^t(c).
\]




The preceding equation indicates that $c$ is a root of the polynomial $f^t(x) - x$. On the other
hand, since $\cos(\alpha\pi) \in \mathbb{Q}$, it also follows that $c = f^m(\cos(\alpha\pi)) \in \mathbb{Q}$.


Thus, $c$ is a rational root of a polynomial with integer coefficients. We can now apply the
Rational Root Test to greatly limit the possible values of $c$. Once this is done, we can then
determine the possible values of $\cos(\alpha\pi)$. In the proof of Theorem 11.1, instead of using the
formula for $\cos(2\theta)$, we will actually use a formula for $2\cos(2\theta)$. This minor change will allow
us to deal with monic polynomials, and this will make our use of the Rational Root Test even
more efficient.


Similar reasoning will allow us to show that if $\alpha$ and $\cos^2(\alpha\pi)$ are both rational, then
$2\cos^2(\alpha\pi)$ must belong to the set $\left\{0, \frac{1}{2}, 1, \frac{3}{2}, 2\right\}$. The remainder of the proof of Theorem 11.1
will then consist of some simple computations.


We can now start down the path that will lead us to the proof of Theorem 11.1. Many of the 
steps along the way will make heavy use of Mathematical Induction. 


Lemma 11.4. Let $b_0$ be a rational number in the interval $[-2, 2]$ and use it to define a
sequence by letting $b_{n+1} = b_n^2 - 2$, for all $n \ge 0$. Then

(a) $b_n$ is a rational number in the interval $[-2, 2]$, for all $n \ge 0$;
(b) if $b_n = b_m$, where $n > m$, then $b_m$ belongs to the set $\{\pm 1, \pm 2\}$;
(c) if there exists some $m \ge 0$ such that $b_m$ belongs to the set $\{0, \pm 1, \pm 2\}$, then $b_0$ also
belongs to the set $\{0, \pm 1, \pm 2\}$.


Proof. Part (a) will follow using a fairly simple Mathematical Induction argument. First, the
result clearly holds when $n = 0$. Next, suppose $k \ge 0$ such that $b_k$ has the desired properties.
We need to show that $b_{k+1}$ also has the desired properties. Since $b_k \in \mathbb{Q}$, it follows that
$b_{k+1} = b_k^2 - 2$ is also rational. Next, since $b_k \in [-2, 2]$, we see that $b_k^2 \in [0, 4]$, so
$b_{k+1} = b_k^2 - 2 \in [-2, 2]$. Thus, $b_{k+1}$ does inherit the desired properties from $b_k$, thereby
concluding the proof of (a).


For part (b), let $f(x) = x^2 - 2$ and let $f^l(x)$ denote the composition of $l$ copies of $f(x)$, for all
$l \in \mathbb{N}$. Observe that $b_l = f^l(b_0)$, for all $l \ge 1$, and if $t = n - m$, then $b_n = f^t(b_m)$. We now claim
that if $l \ge 2$, then $f^l(x)$ is a monic polynomial of degree $2^l$ with a constant term equal to 2.
First, observe that

\[
f^2(x) = f(f(x)) = f(x^2 - 2) = (x^2 - 2)^2 - 2 = (x^4 - 4x^2 + 4) - 2 = x^4 - 4x^2 + 2,
\]

thereby settling the $l = 2$ case.


Next, suppose that $k \ge 2$ such that $f^k(x)$ has the desired properties. Then

\[
f^{k+1}(x) = f(f^k(x)) = f\left(x^{2^k} + \text{lower-degree terms} + 2\right)
= \left(x^{2^k} + \text{lower-degree terms} + 2\right)^2 - 2
\]
\[
= \left(x^{2^{k+1}} + \text{lower-degree terms} + 4\right) - 2
= x^{2^{k+1}} + \text{lower-degree terms} + 2.
\]

Therefore, $f^{k+1}(x)$ inherits the desired properties from $f^k(x)$. Thus, for all $l \ge 2$, $f^l(x)$ is
indeed monic of degree $2^l$ with a constant term of 2.


Now, let $c = b_m$ and also let $t = n - m$. Since $b_n = b_m$, we have

\[
c = b_m = b_n = f^t(b_m) = f^t(c).
\]

Therefore, $c$ is a root of the polynomial $f^t(x) - x$.


From part (a), $c$ is a rational number in the interval $[-2, 2]$. Using our preceding argument, if
$t \ge 2$, then $f^t(x) - x$ is monic with constant term 2. On the other hand, if $t = 1$, then $f^t(x) - x$
is monic with constant term $-2$. Regardless of which case we are in, since $c$ is a root of
$f^t(x) - x$, we can apply the Rational Root Test to assert that $c$ must be an integer that divides
2. Hence, $b_m = c \in \{\pm 1, \pm 2\}$, concluding the proof of (b).


For part (c), we will proceed using a proof by contradiction. If $b_0 \notin \{0, \pm 1, \pm 2\}$, we can let $k$
be the largest integer such that $0 \le k \le m - 1$ and $b_k \notin \{0, \pm 1, \pm 2\}$. Observe that if $k < m - 1$,
then our choice of $k$ guarantees that $b_{k+1} \in \{0, \pm 1, \pm 2\}$. On the other hand, if $k = m - 1$, then
$k + 1 = m$ and $b_{k+1} = b_m \in \{0, \pm 1, \pm 2\}$. Therefore, in both cases, $b_{k+1} \in \{0, \pm 1, \pm 2\}$. Since
$b_k^2 - 2 = b_{k+1} \in \{-2, -1, 0, 1, 2\}$, we immediately see that

\[
b_k^2 \in \{0, 1, 2, 3, 4\}.
\]

Using the fact that $b_k \in \mathbb{Q}$, it now follows that

\[
b_k \in \{0, \pm 1, \pm 2\},
\]

which contradicts our choice of $k$. Thus, $b_0 \in \{0, \pm 1, \pm 2\}$, as desired.


The most important piece of information obtained from Lemma 11.4 is that if the sequence
$b_0, b_1, b_2, b_3, \ldots$ ever repeats a term, then the initial term $b_0$ must belong to $\{0, \pm 1, \pm 2\}$. We
now show how the sequence in Lemma 11.4 is related to the double-angle formula for the
cosine.
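The behavior described by Lemma 11.4 is easy to observe experimentally. The following Python sketch iterates $b_{n+1} = b_n^2 - 2$ in exact rational arithmetic and reports the first repeated term, if any, within a fixed number of steps. The function name `first_repetition` and the cutoff `max_steps` are assumptions made for this illustration; a result of `None` only means that no repetition was seen within the cutoff, which is weaker than the lemma itself.

```python
from fractions import Fraction

def first_repetition(b0, max_steps=8):
    """Iterate b -> b*b - 2 starting from b0.

    Returns (m, n) with m < n for the first n such that b_n = b_m,
    or None if b_0, ..., b_max_steps are all distinct.
    """
    seen = {}
    b = Fraction(b0)
    for n in range(max_steps + 1):
        if b in seen:
            return seen[b], n
        seen[b] = n
        b = b * b - 2
    return None

# Starting values in {0, +/-1, +/-2} repeat quickly; 1/2 and 3/2 do not repeat here.
repeating = {b0: first_repetition(b0) for b0 in (0, 1, -1, 2, -2)}
non_repeating = [first_repetition(Fraction(1, 2)), first_repetition(Fraction(3, 2))]
```

For a starting value such as $\frac{1}{2}$, the denominators of the terms are $2, 4, 16, 256, \ldots$, which gives an informal sense of why no term can recur.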


Lemma 11.5. If $\cos(\theta) \in \mathbb{Q}$, let $b_0 = 2\cos(\theta)$ in the sequence defined in Lemma 11.4. Then
$b_n = 2\cos(2^n\theta)$, for all $n \ge 0$.




Proof. As you probably expected, we will prove this using Mathematical Induction. The $n = 0$
case is clear, so let us now suppose that $k \ge 0$ has the property that $b_k = 2\cos(2^k\theta)$. We need to
show that $b_{k+1} = 2\cos(2^{k+1}\theta)$.


If we apply the double-angle formula to the angle $2^k\theta$, we obtain

\[
b_{k+1} = b_k^2 - 2 = (2\cos(2^k\theta))^2 - 2 = 4\cos^2(2^k\theta) - 2
= 2(2\cos^2(2^k\theta) - 1) = 2\cos(2 \cdot 2^k\theta) = 2\cos(2^{k+1}\theta),
\]

as desired.
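As a numerical sanity check of Lemma 11.5, the Python sketch below compares the recursively defined terms with $2\cos(2^n\theta)$ for the first few $n$. The choice $\cos(\theta) = 3/5$ and the function name `doubling_sequence` are assumptions made for illustration. Because the recursion magnifies rounding error at each step, only a handful of terms are compared and a tolerance is used rather than exact equality.

```python
import math

def doubling_sequence(b0, length):
    """Return [b_0, ..., b_{length-1}] for the recursion b_{n+1} = b_n^2 - 2."""
    terms = [b0]
    for _ in range(length - 1):
        terms.append(terms[-1] ** 2 - 2)
    return terms

theta = math.acos(3 / 5)                      # cos(theta) = 3/5 is rational
terms = doubling_sequence(2 * math.cos(theta), 10)

# Compare each b_n with 2*cos(2^n * theta).
errors = [abs(b - 2 * math.cos(2 ** n * theta)) for n, b in enumerate(terms)]
```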


We now have all the pieces we need to prove Theorem 11.1. 
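Proof of Theorem 11.1. Suppose both $\alpha$ and $\cos^2(\alpha\pi)$ are rational. Let $\theta = 2\alpha\pi$; then the
double-angle formula for the cosine tells us that

\[
\cos(\theta) = \cos(2 \cdot \alpha\pi) = 2\cos^2(\alpha\pi) - 1 \in \mathbb{Q}.
\]

Since $\cos(\theta) \in \mathbb{Q}$, we can let $b_0 = 2\cos(\theta)$ and use $b_0$ to generate the sequence described in
Lemma 11.4.


We know that since $\alpha \in \mathbb{Q}$, we can write $\alpha = \frac{a}{b}$ where $a, b \in \mathbb{Z}$ and $b > 0$. Next, consider the
sequence

\[
a, \; 2a, \; 4a, \; 8a, \; 16a, \; \ldots
\]

When we divide integers by $b$, there are only $b$ possible remainders. Therefore, there exist
terms in the preceding sequence that yield the same remainder when divided by $b$. As a result,
there exist integers $n > m \ge 0$ such that $2^n a$ and $2^m a$ have the same remainder when divided
by $b$. Therefore, we can write

\[
2^n a = 2^m a + sb,
\]

where $s \in \mathbb{Z}$. Doubling this equation results in

\[
2^n(2a) = 2^m(2a) + 2sb.
\]

Dividing by $b$ yields

\[
2^n\left(\frac{2a}{b}\right) = 2^m\left(\frac{2a}{b}\right) + 2s.
\]

Then multiplying by $\pi$ gives us

\[
2^n\left(\frac{2a}{b}\pi\right) = 2^m\left(\frac{2a}{b}\pi\right) + s \cdot 2\pi.
\]

Since $2^n\left(\frac{2a}{b}\pi\right)$ and $2^m\left(\frac{2a}{b}\pi\right)$ differ by an integer multiple of $2\pi$, they give us the same value
when plugged into the cosine. Combining this with the facts that $\theta = 2\alpha\pi$ and $\alpha = \frac{a}{b}$, we obtain

\[
\cos(2^n\theta) = \cos(2^n \cdot 2\alpha\pi) = \cos\left(2^n \cdot \frac{2a}{b}\pi\right)
= \cos\left(2^m \cdot \frac{2a}{b}\pi\right) = \cos(2^m \cdot 2\alpha\pi) = \cos(2^m\theta).
\]

It now follows from Lemma 11.5 and the previous equation that

\[
b_n = 2\cos(2^n\theta) = 2\cos(2^m\theta) = b_m.
\]

As a result, the sequence defined in Lemma 11.4 has a repetition, so part (b) of the lemma tells
us that $b_m \in \{\pm 1, \pm 2\}$, and part (c) then tells us that $b_0 \in \{0, \pm 1, \pm 2\}$. The double-angle
formula now tells us that

\[
b_0 = 2\cos(\theta) = 2\cos(2 \cdot \alpha\pi) = 2(2\cos^2(\alpha\pi) - 1).
\]

Thus,

\[
2(2\cos^2(\alpha\pi) - 1) \in \{-2, -1, 0, 1, 2\}.
\]

A series of simple calculations now yields

\[
2\cos^2(\alpha\pi) - 1 \in \left\{-1, -\frac{1}{2}, 0, \frac{1}{2}, 1\right\},
\qquad
\cos^2(\alpha\pi) \in \left\{0, \frac{1}{4}, \frac{1}{2}, \frac{3}{4}, 1\right\},
\]

and finally,

\[
\cos(\alpha\pi) \in \left\{0, \pm\frac{1}{2}, \pm\frac{\sqrt{2}}{2}, \pm\frac{\sqrt{3}}{2}, \pm 1\right\},
\]

thereby concluding the proof.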


It is now an easy task to use Theorem 11.1 to prove Corollary 11.2. 


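Proof of Corollary 11.2. Suppose both $\alpha$ and $\sin^2(\alpha\pi)$ are rational. In light of the identity
$\sin^2(\theta) + \cos^2(\theta) = 1$, it follows that $\cos^2(\alpha\pi)$ is also rational. Theorem 11.1 now tells us that

\[
\cos(\alpha\pi) \in \left\{0, \pm\frac{1}{2}, \pm\frac{\sqrt{2}}{2}, \pm\frac{\sqrt{3}}{2}, \pm 1\right\}.
\]

Thus,

\[
\cos^2(\alpha\pi) \in \left\{0, \frac{1}{4}, \frac{1}{2}, \frac{3}{4}, 1\right\}
\]

and the identity $\sin^2(\theta) + \cos^2(\theta) = 1$ implies that

\[
\sin^2(\alpha\pi) \in \left\{0, \frac{1}{4}, \frac{1}{2}, \frac{3}{4}, 1\right\}.
\]

It now immediately follows that

\[
\sin(\alpha\pi) \in \left\{0, \pm\frac{1}{2}, \pm\frac{\sqrt{2}}{2}, \pm\frac{\sqrt{3}}{2}, \pm 1\right\}.
\]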


Using Theorem 11.1 to prove Corollary 11.3 is also a fairly straightforward task. 


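Proof of Corollary 11.3. Suppose $\alpha$ and $\tan^2(\alpha\pi)$ are both rational. Using the identity
$\sec^2(\theta) = \tan^2(\theta) + 1$, we see that $\sec^2(\alpha\pi)$ is also rational. Since the secant function is the
reciprocal of the cosine function, we know that $\cos^2(\alpha\pi)$ is rational.


Observe that since $\tan^2(\alpha\pi)$ is rational, $\cos(\alpha\pi)$ cannot be equal to 0. Therefore, in our present
situation, Theorem 11.1 tells us that

\[
\cos(\alpha\pi) \in \left\{\pm\frac{1}{2}, \pm\frac{\sqrt{2}}{2}, \pm\frac{\sqrt{3}}{2}, \pm 1\right\}.
\]

A short series of calculations tells us that

\[
\cos^2(\alpha\pi) \in \left\{\frac{1}{4}, \frac{1}{2}, \frac{3}{4}, 1\right\},
\qquad
\sec^2(\alpha\pi) \in \left\{4, 2, \frac{4}{3}, 1\right\},
\qquad
\sec^2(\alpha\pi) - 1 \in \left\{3, 1, \frac{1}{3}, 0\right\}.
\]

Using the fact that $\tan^2(\theta) = \sec^2(\theta) - 1$, we see that

\[
\tan^2(\alpha\pi) \in \left\{3, 1, \frac{1}{3}, 0\right\},
\]

which leads to

\[
\tan(\alpha\pi) \in \left\{0, \pm\frac{\sqrt{3}}{3}, \pm 1, \pm\sqrt{3}\right\},
\]

as desired.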




At this point, we need a little bit of terminology in order to make some additional observations
about values of trigonometric functions. Real numbers that are the root of a nonconstant
polynomial with integer coefficients are known as algebraic. Certainly every rational number
is algebraic. To see this, observe that if $\alpha \in \mathbb{Q}$, then we can write $\alpha = \frac{a}{b}$ where $a \in \mathbb{Z}$ and
$b \in \mathbb{N}$. Then $\alpha$ is a root of the polynomial $bx - a$.


There are also many irrational numbers that are algebraic. As we have seen, numbers such as
$2^{1/2}$, $5^{1/3}$, and $7^{3/4}$ are all irrational. Yet $2^{1/2}$ is a root of $x^2 - 2$, $5^{1/3}$ is a root of $x^3 - 5$, and $7^{3/4}$ is a
root of $x^4 - 343$. Thus, $2^{1/2}$, $5^{1/3}$, and $7^{3/4}$ are all algebraic. More complicated-looking numbers


1 

6\ 3 

like /3 + 5 and (2-4) are also algebraic. By doing a series of computations, it is not 
192 

too hard to find nonconstant polynomials that have these numbers as roots. 


Real numbers that are not algebraic are known as transcendental. Although transcendental
numbers abound, it is often quite difficult to determine whether a particular number is
transcendental. It turns out that both $e$ and $\pi$ are transcendental. Whereas it takes a good deal
of work to verify that $e$ is transcendental, it takes even more to do the same for $\pi$.


In light of Theorem 11.1, we know that cos(1°), sin(40°), and tan(83°) are all irrational.
However, we will soon show that all three are algebraic. More precisely, after returning to
radian measure, we will see that if $\alpha \in \mathbb{Q}$ such that the value of one of the six trigonometric
functions exists when plugging in $\alpha\pi$, then that value must be algebraic. We will begin by
looking at the cosine and sine. The idea will be to use DeMoivre’s Theorem to find a
polynomial that has both $\cos(\alpha\pi)$ and $\sin(\alpha\pi)$ as roots.


Theorem 11.6. If $\alpha \in \mathbb{Q}$, then $\cos(\alpha\pi)$ and $\sin(\alpha\pi)$ are both algebraic.


Proof. Since $\alpha \in \mathbb{Q}$, we can write $\alpha = \frac{a}{b}$ where $a \in \mathbb{Z}$ and $b \in \mathbb{N}$. DeMoivre’s Theorem tells
us that

\[
(\cos(\theta) + i\sin(\theta))^{4b} = \cos(4b\theta) + i\sin(4b\theta).
\]

When we use the binomial theorem to expand the left-hand side of the equation and then
compare the real parts of both sides of the equation, we obtain

\[
\cos^{4b}(\theta) - \binom{4b}{2}\cos^{4b-2}(\theta)\sin^2(\theta) + \binom{4b}{4}\cos^{4b-4}(\theta)\sin^4(\theta) - \cdots
- \binom{4b}{4b-2}\cos^2(\theta)\sin^{4b-2}(\theta) + \sin^{4b}(\theta) = \cos(4b\theta).
\]

On the left-hand side of the equation, we can use the identity $\sin^2(\theta) + \cos^2(\theta) = 1$ to replace
every term of the form $\sin^{2l}(\theta)$ by $(1 - \cos^2(\theta))^l$. This will result in the left-hand side
becoming a sum of terms of the form $c\cos^{2t}(\theta)$, where $c \in \mathbb{Z}$ and $t \ge 0$. As a result, there exists
a nonconstant polynomial $f(x) \in \mathbb{Z}[x]$, in which $x$ only appears with even exponents, such that

\[
f(\cos(\theta)) = \cos(4b\theta).
\]

If we let $\theta = \alpha\pi$ in the previous equation, then we see that

\[
f(\cos(\alpha\pi)) = \cos(4b \cdot \alpha\pi) = \cos\left(4b \cdot \frac{a}{b}\pi\right) = \cos(4a\pi) = \cos(2a \cdot 2\pi) = 1.
\]

Therefore, $\cos(\alpha\pi)$ is a root of the polynomial $f(x) - 1$, so $\cos(\alpha\pi)$ is algebraic.


It now suffices to show that $\sin(\alpha\pi)$ is also a root of $f(x) - 1$. Since $\frac{\pi}{2} - \alpha\pi$ and $\alpha\pi$ add up to
$\frac{\pi}{2}$, it follows that $\cos\left(\frac{\pi}{2} - \alpha\pi\right) = \sin(\alpha\pi)$. Therefore, if we let $\theta = \frac{\pi}{2} - \alpha\pi$ in the equation
$f(\cos(\theta)) = \cos(4b\theta)$, we obtain

\[
f(\sin(\alpha\pi)) = f\left(\cos\left(\frac{\pi}{2} - \alpha\pi\right)\right) = \cos\left(4b\left(\frac{\pi}{2} - \alpha\pi\right)\right)
= \cos\left(4b\left(\frac{\pi}{2} - \frac{a}{b}\pi\right)\right) = \cos((b - 2a)2\pi) = 1.
\]

Thus, $f(\sin(\alpha\pi)) - 1 = 0$, so $\sin(\alpha\pi)$ is also a root of $f(x) - 1$.
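The polynomial $f$ in this proof can be computed explicitly. The Python sketch below relies on an assumption brought in from outside the text: the polynomial satisfying $f(\cos(\theta)) = \cos(n\theta)$ is usually called the Chebyshev polynomial $T_n$, and it can be generated by the recurrence $T_0 = 1$, $T_1 = x$, $T_{k+1} = 2xT_k - T_{k-1}$. The function names and the choice $\alpha = \frac{2}{3}$ are hypothetical and for illustration only.

```python
from fractions import Fraction

def chebyshev(n):
    """Integer coefficients of T_n, lowest degree first,
    from T_0 = 1, T_1 = x, T_{k+1} = 2x*T_k - T_{k-1}."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in cur]      # 2x * T_k
        for i, c in enumerate(prev):          # minus T_{k-1}
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur

def evaluate(coeffs, x):
    """Evaluate a polynomial whose coefficients are listed lowest degree first."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

a, b = 2, 3                        # alpha = 2/3, so cos(alpha*pi) = -1/2
f = chebyshev(4 * b)               # f(cos(theta)) = cos(4b*theta)

# x appears only with even exponents, as claimed in the proof.
only_even = all(c == 0 for i, c in enumerate(f) if i % 2 == 1)

# Exact check that cos(alpha*pi) = -1/2 is a root of f(x) - 1.
cos_value = evaluate(f, Fraction(-1, 2))

# sin(alpha*pi) = sqrt(3)/2 is irrational, but f involves only x^2 = 3/4.
sin_value = evaluate(f[::2], Fraction(3, 4))
```

Both `cos_value` and `sin_value` should equal 1, matching the two computations in the proof for this particular $\alpha$.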


In order to examine the other trigonometric functions, we need a lemma that essentially says
that if $\beta$ is a nonzero algebraic number, then $\beta^{-1}$ is also algebraic.


Lemma 11.7. Suppose $\beta, a_0, a_1, \ldots, a_{n-1}, a_n$ belong to a field such that $\beta$ is a nonzero root of
the polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$. Then $\beta^{-1}$ is a root of the polynomial
$a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$.


Proof. Plugging $\beta$ into the polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ results in

\[
a_n \beta^n + a_{n-1} \beta^{n-1} + \cdots + a_1 \beta + a_0 = 0.
\]

Multiplying both sides of this equation by $(\beta^{-1})^n$ yields

\[
a_n + a_{n-1}(\beta^{-1}) + \cdots + a_1 (\beta^{-1})^{n-1} + a_0 (\beta^{-1})^n = 0.
\]

However, it is easy to see that the preceding equation says that $\beta^{-1}$ is indeed a root of
$a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$.
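In terms of coefficient lists, Lemma 11.7 says that reversing the order of the coefficients sends each nonzero root to its reciprocal. A minimal Python sketch, using the hand-picked example $2x^2 - x - 3 = (2x - 3)(x + 1)$ and hypothetical helper names:

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Horner evaluation; coefficients are listed from highest degree to lowest."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def reversed_coefficients(coeffs):
    """Turn a_n, ..., a_0 into a_0, ..., a_n, as in Lemma 11.7."""
    return coeffs[::-1]

p = [2, -1, -3]                    # 2x^2 - x - 3, which has beta = 3/2 as a root
beta = Fraction(3, 2)
q = reversed_coefficients(p)       # -3x^2 - x + 2

value_at_beta = evaluate(p, beta)
value_at_inverse = evaluate(q, 1 / beta)
```

Both values should be 0; the other root $-1$ of $p$ is its own reciprocal and is also a root of $q$.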


We can now extend Theorem 11.6 to the other four trigonometric functions. When reading the
statement of this result, remember that whereas the cosine and sine functions are defined for all
real numbers, there exist $\alpha \in \mathbb{Q}$ such that $\sec(\alpha\pi)$, $\csc(\alpha\pi)$, $\tan(\alpha\pi)$, or $\cot(\alpha\pi)$ may
fail to be defined.




Theorem 11.8. If $\cos(\alpha\pi)$, $\sin(\alpha\pi)$, $\tan(\alpha\pi)$, $\sec(\alpha\pi)$, $\csc(\alpha\pi)$, or $\cot(\alpha\pi)$ is defined, where
$\alpha \in \mathbb{Q}$, then it must be algebraic.


Proof. Since $\alpha \in \mathbb{Q}$, Theorem 11.6 asserts that both $\cos(\alpha\pi)$ and $\sin(\alpha\pi)$ are algebraic.
Recall that the secant and the cosecant are the reciprocals of the cosine and sine. Therefore,
Lemma 11.7 tells us that if $\sec(\alpha\pi)$ or $\csc(\alpha\pi)$ is defined, then it must be algebraic.


Next, if $\cot(\alpha\pi)$ is defined, then either $\cot(\alpha\pi) = 0$ or $\cot(\alpha\pi) = \frac{1}{\tan(\alpha\pi)}$. In the first case,
$\cot(\alpha\pi)$ is clearly algebraic. Therefore, in light of Lemma 11.7, to conclude the proof, it
suffices to show that if $\tan(\alpha\pi)$ is defined, then it is algebraic. At this point, we need to refer
back to the proof of Theorem 11.6 and look at the nonconstant polynomial $f(x) \in \mathbb{Z}[x]$ that
had the property that $\cos(\alpha\pi)$ was a root of $f(x) - 1$. The idea behind this proof is to modify
$f(x) - 1$ to find a nonconstant element of $\mathbb{Z}[x]$ that has $\tan(\alpha\pi)$ as a root.


Since $\tan(\alpha\pi)$ is defined, we know that $\cos(\alpha\pi) \neq 0$. Therefore, $\sec(\alpha\pi) = \frac{1}{\cos(\alpha\pi)}$ is also
defined. Lemma 11.7 allows us to modify the polynomial $f(x) - 1$ to obtain a polynomial $g(x)$
which has $\sec(\alpha\pi)$ as a root. Recall that whenever $x$ appeared in $f(x)$, it had an even exponent.
It is easy to see that $f(x) - 1$ has this property and the proof of Lemma 11.7 indicates that this
property is also inherited by $g(x)$. Since all the exponents of $x$ in $g(x)$ are even, there exists a
nonconstant $h(x) \in \mathbb{Z}[x]$ such that $h(x^2) = g(x)$. Therefore,

\[
h(\sec^2(\alpha\pi)) = g(\sec(\alpha\pi)) = 0.
\]

Now, let $M(x) = h(x^2 + 1)$. Clearly, $M(x)$ is a nonconstant polynomial with integer
coefficients. Using the identity $\sec^2(\theta) = \tan^2(\theta) + 1$, we now have

\[
M(\tan(\alpha\pi)) = h(\tan^2(\alpha\pi) + 1) = h(\sec^2(\alpha\pi)) = 0.
\]

Thus, $\tan(\alpha\pi)$ is a root of $M(x)$, hence $\tan(\alpha\pi)$ is algebraic, thereby concluding the proof.


Exercises for Section 11.1 


Many of the exercises in this chapter will use DeMoivre’s Theorem. You will frequently be
required to find polynomials in $\mathbb{Z}[x]$ that, for various values of $\theta$, have either $\cos(\theta)$ or $\sin(\theta)$
as a root.


1. Derive the formula $\cos(3\theta) = 4\cos^3(\theta) - 3\cos(\theta)$.

2. Derive the formula $\sin(3\theta) = -4\sin^3(\theta) + 3\sin(\theta)$.

3. Use exercise 1 to show that cos(20°) is a root of the polynomial $8x^3 - 6x - 1$.

4. Show that $8x^3 - 6x - 1$ is irreducible in $\mathbb{Q}[x]$ and explain why it has three distinct roots
in $\mathbb{C}$.




5. Use exercise 1 to show that cos(100°) and cos(140°) are the other two roots of
$8x^3 - 6x - 1$.


6. Show that the polynomial $8x^3 - 6x + 1$ is also irreducible in $\mathbb{Q}[x]$ and has three distinct
roots in $\mathbb{C}$.


7. Use exercise 1 to show that cos(40°), cos(80°), and cos(160°) are the three roots of
$8x^3 - 6x + 1$.


8. Use exercise 2 to show that sin(10°), sin(50°), and sin(250°) are also the three roots of
$8x^3 - 6x + 1$. Pair up these roots with the three roots you found in exercise 7.


For exercises 9-13, please read the following: 


In Chapter 11, we indicated that there was a surprising lack of symmetry between the sine and
cosine functions, as the main results of Chapter 11 could be obtained by using the cosine but
not the sine. However, a slightly weaker version of Theorem 11.1 can be obtained using the
sine function. In particular, we can prove that the only rational values of the sine function that
occur when plugging in angles of the form απ, where α ∈ Q, must belong to the set
{0, ±1/2, ±1}. We will work through the details of this proof in the next five exercises. To begin,
let b_0 be a rational number in the interval [−2, 2] and then use it to define a sequence by
letting b_{n+1} = −b_n^3 + 3b_n, for n ≥ 0.
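Before attempting the proofs, it can help to watch the recurrence in action. The following is only a numerical sanity check, not a proof: the helper name `step`, the starting value 1/2, and the sample angle of 10° are all arbitrary choices made for illustration.

```python
from fractions import Fraction
import math

def step(b):
    """One step of the recurrence b_{n+1} = -b_n^3 + 3*b_n."""
    return -b**3 + 3 * b

# Exact arithmetic: start from a rational b_0 in [-2, 2] (1/2 is an arbitrary choice).
rational_terms = [Fraction(1, 2)]
for _ in range(4):
    rational_terms.append(step(rational_terms[-1]))
print(rational_terms)

# Floating point: start from b_0 = 2 sin(theta) with theta = 10 degrees
# and compare with 2 sin(3^n * theta) for n = 0, 1, 2, 3.
theta = math.radians(10)
values = [2 * math.sin(theta)]
for _ in range(3):
    values.append(step(values[-1]))
expected = [2 * math.sin(3**n * theta) for n in range(4)]
print(values)
print(expected)
```

The exact terms stay rational and inside [−2, 2], and the floating-point terms agree with 2 sin(3^n θ) up to rounding, which is what exercises 9 and 12 ask you to prove in general.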


9. Show that, for every n ≥ 0, b_n is a rational number in the interval [−2, 2].
10. Show that if n > m and b_n = b_m, then b_n ∈ {0, ±1, ±2}.
11. Show that if b_m ∈ {0, ±1, ±2}, for some m ≥ 0, then b_0 ∈ {0, ±1, ±2}.
12. Show that if b_0 = 2 sin(θ), then b_n = 2 sin(3^n θ), for all n ≥ 0.
13. Show that if both α and sin(απ) belong to Q, then sin(απ) ∈ {0, ±1/2, ±1}.

For exercises 14-20, please read the following:


In this next set of exercises, we will compute the exact values of cos(36°) and cos(72°). In 
light of Theorem 11.1, neither of these values of the cosine is rational, nor is it the square root 
of a rational number. However, as we will soon see, they are both roots of quadratic 
polynomials in Z[x]. 


14. Derive the formula 2cos(4θ) = (2cos(θ))^4 − 4(2cos(θ))^2 + 2.
15. Show that 2cos(36°) is a root of the polynomial x^4 − 4x^2 + x + 2.
16. Use the Rational Root Test to find all rational roots of x^4 − 4x^2 + x + 2.


17. Find a monic, quadratic polynomial with integer coefficients which has 2 cos(36°) as a 
root. 




18. Find the exact value of cos(36°). 

19. Show that 2cos(108°) is the other root of the polynomial in exercise 17. 
20. Find the exact value of cos(72°). 

For exercises 21—25, please read the following: 


In this set of exercises, we will compute the exact values of cos(18°) and cos(54°). As we 
shall see, these values of the cosine satisfy polynomials in Q[x] of degree 4 but do not satisfy 
any nonconstant polynomial of degree less than 4. 


21. Derive the formula 2cos(5θ) = (2cos(θ))^5 − 5(2cos(θ))^3 + 5(2cos(θ)).
22. Show that 2cos(18°) is a root of x^4 − 5x^2 + 5.


23. Show that the four roots of x^4 − 5x^2 + 5 are 2cos(18°), 2cos(54°), 2cos(126°),
2cos(162°).


24. Find the exact value of cos(18°) and the exact value of cos(54°). 


25. Find a monic irreducible polynomial in Q[x] which has cos(18°) and cos(54°) as two of 
its roots. 


For exercises 26-30, please read the following: 


Next, we will compute the exact values of cos(22.5°) and cos(67.5°). These values of the 
cosine will also satisfy polynomials in Q[x] of degree 4 but satisfy no nonconstant polynomial 
of degree less than 4. 


26. Use the formula in exercise 14 to find a monic polynomial of degree 4 in Z[x] that has
2cos(22.5°) as a root.


27. Show that the polynomial in exercise 26 is irreducible over Q. 


28. Show the other three roots of the polynomial obtained in exercise 26 are 2cos(67.5°), 
2cos(112.5°), and 2cos(157.5°). 


29. Find the exact values of cos(22.5°) and cos(67.5°). 


30. Find a monic irreducible polynomial in Q[x] that has cos(22.5°) and cos(67.5°) as two 
of its roots. 


31. If n ∈ N is even, show that there exists a polynomial of degree n in Q[x] that has cos(+)
as a root. 


32. If n ≥ 3 is odd, show that there exists a polynomial of degree n − 1 in Q[x] that has
cos(π/(2n)) as a root.




33. If p is an odd prime, examine the polynomial obtained in exercise 32, and then show that
there exists an irreducible polynomial of degree p − 1 in Q[x] that has cos(π/(2p)) as
a root.


34. Show that the polynomial in exercise 33 has p − 1 distinct roots and these roots are all
of the form cos(((2i − 1)/(2p))π), where i takes on all integer values from 1 to p except for
(p + 1)/2.


35. Use exercise 34 to find the smallest degree of a nonconstant polynomial over Q that has
cos(4) as a root, and then find all the roots of this polynomial.


36. Use exercise 34 to find the smallest degree of a nonconstant polynomial over Q that has
cos(4) as a root, and then find all the roots of this polynomial.


Polynomials over Arbitrary Fields 


To this point, we have looked at various examples of commutative rings. In Chapter 3, we 
took an in-depth look at our first example, the integers. While studying Z, we examined four 
concepts of particular importance: prime numbers, the division algorithm, the Euclidean 
Algorithm, and the existence and uniqueness of prime factorization. In many commutative 
rings, analogs of these concepts do not exist in any recognizable form. However, it may come 
as a pleasant surprise that for any field F, similar concepts exist in F[x]. In fact, when
studying F[x], many of the proofs and ideas will be almost identical to the ones you saw when
we studied Z. Since F[x] is a more abstract object, as you work through the proofs in this 
chapter, it might be worthwhile to go back and review some of the proofs in Chapter 3. 


There are many important facts about polynomials that are stated without proof in high school 
algebra courses. In this chapter, we will prove several of these, such as 


(i) if f(x) ∈ F[x] and if α ∈ F, then α is a root of f(x) if and only if x − α is a factor of f(x);


(ii) if f(x) ∈ R[x], then f(x) can be written as a product of linear polynomials and
irreducible quadratics in R[x];


(iii) if f(x) ∈ F[x] has degree n, then f(x) has at most n roots in F.


12.1 Similarities between Polynomials and Integers 


We now start down a path that will reveal remarkable similarities between Z and F[x] by 
looking at divisibility. Throughout this chapter, F will always be a field, and F[x] will always
be the set of polynomials with coefficients in F. In many of the proofs in this chapter, we will
use notation similar to that used in Chapter 3. Hopefully, this should make the similarities 
between Z and F[x] even more apparent. 


Definition 12.1. Given polynomials a(x), b(x) ∈ F[x] (with a(x) ≠ 0), we say that a(x)
divides b(x), written a(x) | b(x), if there exists a polynomial m(x) ∈ F[x] such that
b(x) = a(x)·m(x). In this case, we also say that a(x) is a divisor of b(x).


If a(x) is not a divisor of b(x), we write a(x) ∤ b(x).


Examples


We will now look at examples of divisibility in Q[x], R[x], C[x], and Z_5[x].

1. In Q[x],
(x^2 + 1) | (x^3 − 5x^2 + x − 5) as x^3 − 5x^2 + x − 5 = (x^2 + 1)(x − 5)
and
(2x + 7) | (6x^3 + 29x^2 + 26x − 7) as 6x^3 + 29x^2 + 26x − 7 = (2x + 7)(3x^2 + 4x − 1).

2. In R[x],
(x − √2) | (x^2 − 2) as x^2 − 2 = (x − √2)(x + √2)
and
(3x^2 + 5) | (21x^4 + 11x^2 − 40) as 21x^4 + 11x^2 − 40 = (3x^2 + 5)(7x^2 − 8).

3. In C[x],
(x + i) | (x^2 + 1) as x^2 + 1 = (x + i)(x − i),
(x + 5) | (x^2 − 2x − 35) as x^2 − 2x − 35 = (x + 5)(x − 7),
and
(x + 2 + i) | (2x^3 + 9x^2 + 14x + 5) as 2x^3 + 9x^2 + 14x + 5 =
(x + 2 + i)(x + 2 − i)(2x + 1) = (x + 2 + i)(2x^2 + (5 − 2i)x + (2 − i)).

4. In Z_5[x],
(x + [2]_5) | (x^2 + [1]_5) as x^2 + [1]_5 = (x + [2]_5)(x + [3]_5)
and
(x + [3]_5) | (x^5 + [3]_5) as x^5 + [3]_5 = (x + [3]_5)^5.

Note that if b(x) = a(x)·m(x) ∈ F[x] then, for any nonzero α ∈ F, we have
b(x) = (α·a(x))·(α^{-1}·m(x)).


This tells us if a(x) is a divisor of b(x), then so is α·a(x), for every nonzero α ∈ F.
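The two factorizations in Z_5[x] are easy to confirm by machine. The sketch below is only an illustration: the function name `poly_mul_mod` and the representation of a polynomial as a list of integer coefficients (lowest degree first) are assumptions made here, not notation from the text.

```python
def poly_mul_mod(a, b, p):
    """Multiply two polynomials with coefficients reduced mod p.

    Polynomials are lists of coefficients, lowest degree first.
    """
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] = (result[i + j] + ai * bj) % p
    return result

# (x + [2]_5)(x + [3]_5) in Z_5[x]
product = poly_mul_mod([2, 1], [3, 1], 5)

# (x + [3]_5)^5 in Z_5[x], built by repeated multiplication
power = [1]
for _ in range(5):
    power = poly_mul_mod(power, [3, 1], 5)

print(product)  # coefficients of x^2 + [1]_5
print(power)    # coefficients of x^5 + [3]_5
```

The middle coefficients of the fifth power vanish because the binomial coefficients 5, 10, 10, 5 are all multiples of 5.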




We can also list many polynomials a(x), b(x) where a(x) is not a divisor of b(x). For example, 
in Q[x], it is not hard to see that 


(x + 1) ∤ (x + 2), (x^3 + 3) ∤ (x^2 + 5x + 2), and x ∤ (x^2 − 1).


Let us now consider the following question: Is 5 a divisor of 2? Certainly one is tempted to 
immediately answer no. This is certainly the correct answer if we are working in Z. On the 
other hand, suppose we are working in Q[x]. In this situation, 2 = 5 · (2/5), so 5 is a divisor of 2 in
Q[x]. One of the differences between working in Z and working in Q[x] is that 5 has a 
multiplicative inverse in Q[x] but not in Z. Observe that in Z the only elements with 
multiplicative inverses are +1, whereas in Q[x] the elements with multiplicative inverses are 
precisely the nonzero elements of Q. 


The preceding example indicates, once again, that the answers to many questions in algebra 
often depend on the context. Previously we had observed that the question of whether a 
polynomial is irreducible also depends on the context. For example, x^2 − 2 is irreducible in
Q[x] but is reducible in R[x]. Since enlarging a field can certainly affect whether a polynomial 
is irreducible, it is natural to wonder if enlarging a field can affect whether one polynomial is a 
divisor of the other. To investigate this question, suppose F ⊆ K are fields and suppose

a(x), b(x) belong to F[x]. In one direction, it is not hard to see that if a(x) is a divisor of b(x) 
in F[x], then a(x) is also a divisor of b(x) in K[x]. To see this, observe that there exists some 
m(x) € F[x] such that b(x) = a(x) -m(x). But since m(x) also belongs to K[x], it is now clear 
that a(x) is also a divisor of b(x) in K[x]. The other direction is the harder one. If a(x) is a 
divisor of b(x) in K[x], we need to determine if a(x) is also a divisor of b(x) in F[x]. The 
answer to this question will appear later in this chapter as one of the applications of the 
division algorithm. 


Since every nonzero element of F has a multiplicative inverse in F, every nonzero element
of F is a divisor of every polynomial in F[x]. To see this, observe that if f(x) ∈ F[x] and if α
is a nonzero element of F, then f(x) = α · (α^{-1} f(x)). We should also note that, similar to
the situation in Z, every nonzero polynomial in F[x] is a divisor of 0.


Whereas the building blocks of the integers are the prime numbers, the analogous role in F[x] 
will be played by the irreducible polynomials. The next definition is essentially a restatement 
of Definition 9.8, and, for convenience, we include it here as 


Definition 12.2. Let f(x) ∈ F[x] be a polynomial of degree at least 1. We say that f(x) is
reducible in F[x] if there exist g(x), h(x) ∈ F[x], both with smaller degree than f(x), such
that f(x) = g(x)·h(x). If f(x) ∈ F[x] has degree at least 1 and is not reducible in F[x],
then we say that f(x) is irreducible in F[x].


Recall that to prove the uniqueness of prime factorization in Z, it was important that we not 
consider 1 to be a prime number. For virtually identical reasons, we do not consider
polynomials to be irreducible in F[x] unless their degree is at least 1. To see this, observe that
in Q[x], if we considered the degree 0 polynomial g(x) = 2 to be irreducible, then polynomials 
could not be factored uniquely. For example, we could factor the polynomial f(x) = x in many 
different ways, such as


x = 2 · (x/2) = 2 · 2 · (x/4) = 2 · 2 · 2 · (x/8).

Therefore, in order to prove a unique factorization theorem in F[x], we will not consider 
polynomials of degree 0 to be irreducible. 


As pointed out in Chapter 9, all polynomials of degree 1 in F[x] are irreducible. In light of 
this, we can factor 6x^2 − 6 into a product of two irreducible polynomials in Q[x] in many
different ways. For example, we have


6x^2 − 6 = (x + 1)(6x − 6) = (2x + 2)(3x − 3) = (3x + 3)(2x − 2) =
(6x + 6)(x − 1) = ((1/2)x + 1/2)(12x − 12) = (60x + 60)((1/10)x − 1/10).


This example indicates that in order to prove a unique factorization theorem, we will need the 
following concept. 


Definition 12.3. We say that f(x) € F[x] is monic if its leading coefficient is 1. 


Returning to the polynomial 6x^2 − 6, observe that the only monic polynomials of degree 1 in
Q[x] which are divisors of 6x^2 − 6 are x + 1 and x − 1. We now have


6x^2 − 6 = 6 · (x + 1) · (x − 1).


Note that, except for juggling the order of 6, x + 1, and x − 1, there appears to be no other way
to factor 6x^2 − 6 in Q[x] as the product of an element of Q and monic, irreducible polynomials
in Q[x]. This indicates that the only irreducible polynomials a unique factorization theorem 
should refer to are the monic ones. We can now state the main result of this chapter. 
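The passage from an arbitrary factorization to the normalized form "constant times monic factors" is mechanical: divide each factor by its leading coefficient and collect those coefficients into a single constant. A minimal sketch follows, in which the helper name `normalize` and the coefficient-list representation (lowest degree first) are illustrative assumptions rather than anything defined in the text.

```python
from fractions import Fraction

def normalize(factors):
    """Rewrite a product of polynomials over Q as (constant, monic factors).

    Each factor is a list of coefficients, lowest degree first.
    """
    constant = Fraction(1)
    monic = []
    for f in factors:
        lead = Fraction(f[-1])          # leading coefficient, assumed nonzero
        constant *= lead
        monic.append([Fraction(c) / lead for c in f])
    return constant, monic

# Two of the factorizations of 6x^2 - 6 listed above.
c1, m1 = normalize([[2, 2], [-3, 3]])                                # (2x + 2)(3x - 3)
c2, m2 = normalize([[Fraction(1, 2), Fraction(1, 2)], [-12, 12]])    # ((1/2)x + 1/2)(12x - 12)

print(c1, m1)
print(c2, m2)
```

Both inputs collapse to the same constant 6 and the same monic factors x + 1 and x − 1, which is the behavior a unique factorization theorem should predict.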


Theorem 12.4 (Unique Factorization Theorem). Every polynomial f(x) ∈ F[x] of degree
at least one can be written uniquely (up to order) as a product of an element of F and monic,
irreducible polynomials in F[x].


As was the case for Theorem 3.3, this theorem has two parts. First, we need to show that every 
f(x) ∈ F[x] of degree at least 1 can be written as a product of an element of F and monic,
irreducible polynomials. That will be the easier task, and it will be accomplished by using a
proof by contradiction along with the Well Ordering Principle. When a polynomial is written
as a product of an element of F and monic, irreducible polynomials from F[x], we say that it
has been completely factored. Looking back at the proof of the second part of Theorem 3.3, it
should come as no surprise that the proof of the uniqueness part of Theorem 12.4 will require
some additional mathematical machinery.


Examples


We will examine the factorization of f(x) = 7x^4 − 7x^2 − 140 in Q[x], R[x], and C[x].
1. In Q[x], we begin by factoring out the leading coefficient to obtain


f(x) = 7(x^4 − x^2 − 20).


Next, we need to factor x^4 − x^2 − 20 into monic, irreducible polynomials in Q[x]. The
Rational Root Test shows that x^4 − x^2 − 20 has no rational roots, so x^4 − x^2 − 20 has
no factor in Q[x] of degree 1. However, it is not hard to see that we can factor
x^4 − x^2 − 20 into two quadratics in Q[x], giving us


f(x) = 7(x^2 − 5)(x^2 + 4).


Since x^2 − 5 and x^2 + 4 are both irreducible in Q[x], we have now completely
factored f(x) in Q[x].


2. In R[x], since Q ⊆ R, we go back to the factorization of f(x) in Q[x] and examine
whether the factors x^2 − 5 and x^2 + 4 remain irreducible in R[x]. It is easy to see that in
R[x], x^2 − 5 has monic, irreducible factors x − √5 and x + √5. It is also easy to see
that x^2 + 4 remains irreducible in R[x]. Therefore, the complete factorization of f(x)
in R[x] is


f(x) = 7(x − √5)(x + √5)(x^2 + 4).


3. In C[x], since R ⊆ C, we go back to the factorization of f(x) in R[x] and examine
whether the factor x^2 + 4 remains irreducible in C[x]. However, ±2i are roots in C of
x^2 + 4, so x^2 + 4 has monic, irreducible factors x − 2i and x + 2i in C[x]. Therefore,
the complete factorization of f(x) in C[x] is


f(x) = 7(x − √5)(x + √5)(x − 2i)(x + 2i).


As this example indicates, by enlarging the field, the factorization of f(x) can change.
But this does not contradict the uniqueness portion of Theorem 12.4. Note that
Theorem 12.4 asserts that the factorization of f(x) is unique provided the field
remains fixed. Thus, the only way to factor 7x^4 − 7x^2 − 140 in Q[x] is as
7(x^2 − 5)(x^2 + 4), the only way to factor 7x^4 − 7x^2 − 140 in R[x] is as
7(x − √5)(x + √5)(x^2 + 4), and the only way to factor 7x^4 − 7x^2 − 140 in C[x] is as
7(x − √5)(x + √5)(x − 2i)(x + 2i).




4. For another example, consider the polynomial g(x) = [2]_3 x^4 + [1]_3 in Z_3[x]. This
polynomial can be factored as


g(x) = [2]_3 (x + [1]_3)(x + [2]_3)(x^2 + [1]_3).


Since x^2 + [1]_3 is a quadratic with no roots in Z_3, it is irreducible in Z_3[x]. Therefore,
x + [1]_3, x + [2]_3, and x^2 + [1]_3 are all monic and irreducible in Z_3[x]. Thus, the preceding
factorization is the complete factorization of g(x) = [2]_3 x^4 + [1]_3 in Z_3[x].


When completely factoring a polynomial, as a matter of convenience, we often do not
factor out the leading coefficient. Instead, we sometimes write our polynomial as a product of
irreducible polynomials, some of which might not be monic. For example, if asked to
completely factor 6x^2 + 7x − 5 in Q[x], we are more likely to leave the answer as
6x^2 + 7x − 5 = (2x − 1)(3x + 5) as opposed to writing 6x^2 + 7x − 5 = 6(x − 1/2)(x + 5/3).
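The Rational Root Test step used in the Q[x] example can be carried out by brute force. The sketch below assumes integer coefficients listed lowest degree first and a nonzero constant term; the function names `divisors`, `evaluate`, and `rational_roots` are hypothetical helpers chosen for this illustration.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients lowest degree first) by Horner's rule."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value

def rational_roots(coeffs):
    """All rational roots of an integer polynomial with nonzero constant term.

    Every rational root p/q in lowest terms has p dividing the constant
    term and q dividing the leading coefficient.
    """
    const, lead = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * p, q)
                  for p in divisors(const)
                  for q in divisors(lead)
                  for sign in (1, -1)}
    return sorted(r for r in candidates if evaluate(coeffs, r) == 0)

quartic_roots = rational_roots([-20, 0, -1, 0, 1])   # x^4 - x^2 - 20
quadratic_roots = rational_roots([-5, 7, 6])         # 6x^2 + 7x - 5
print(quartic_roots)
print(quadratic_roots)
```

The quartic has no rational roots, in line with the claim that it has no factor of degree 1 in Q[x], while the roots 1/2 and −5/3 of the quadratic match the factorization 6(x − 1/2)(x + 5/3).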


In our proof of the first part of Theorem 12.4, you should look for the similarities between this 
proof and the proof of the first part of Theorem 3.3. 


Proof of the first part of Theorem 12.4. We proceed with a proof by contradiction and begin 
by supposing that there exists a polynomial f(x) ∈ F[x] of degree at least 1 that cannot be
completely factored in F[x]. The Well Ordering Principle now guarantees that there is a
smallest positive integer m such that there exists a polynomial g(x) ∈ F[x] of degree m that
cannot be completely factored in F[x].


If α is the leading coefficient of g(x), we can let h(x) = α^{-1} g(x). Then h(x) is a monic
polynomial that belongs to F[x] and


g(x) = α h(x).


Let us now examine the nature of h(x). One possibility is that h(x) is irreducible, but, in this
case, we have completely factored g(x) in F[x] as g(x) = α h(x). In this case, we see that,
simultaneously, g(x) can and cannot be completely factored in F[x]. This is certainly a
contradiction, so the case of h(x) being irreducible cannot occur.


The only remaining possibility is that h(x) is not irreducible, so we can write h(x) as a product
of two polynomials in F[x], both of which have smaller degree. Since h(x) is monic, there
exist monic polynomials a(x), b(x) ∈ F[x] such that


h(x) = a(x)·b(x), where 1 ≤ deg(a(x)), deg(b(x)) < m.


Since m is the smallest positive integer such that there exists a polynomial in F[x] of this
degree that cannot be completely factored in F[x], we see that a(x) and b(x) can be
completely factored in F[x]. Since a(x) and b(x) are both monic, there exist monic,
irreducible polynomials


p_1(x), p_2(x), ..., p_s(x), q_1(x), q_2(x), ..., q_t(x) ∈ F[x]


such that


a(x) = p_1(x)·p_2(x)···p_s(x) and b(x) = q_1(x)·q_2(x)···q_t(x).


Note that the list of polynomials p_1(x), p_2(x), ..., p_s(x), q_1(x), q_2(x), ..., q_t(x) is allowed to
have the same polynomial occurring more than once.


Since h(x) = a(x)·b(x), we now have
g(x) = α h(x) = α·a(x)·b(x) =
α·(p_1(x)·p_2(x)···p_s(x))·(q_1(x)·q_2(x)···q_t(x)) =
α·p_1(x)·p_2(x)···p_s(x)·q_1(x)·q_2(x)···q_t(x).


However, the preceding equation illustrates that g(x) can be completely factored in F[x]. This
is a contradiction, since g(x), simultaneously, can and cannot be completely factored in F[x].
We have now shown that the case of h(x) being irreducible as well as the case of h(x) not
being irreducible both lead to a contradiction. Therefore, we can now conclude that every
f(x) ∈ F[x] of degree at least 1 can be completely factored in F[x].


If the field F is infinite, then the set
{x − α | α ∈ F}


is an infinite set of monic, irreducible polynomials in F[x]. Thus, in this case, there are
certainly an infinite number of monic, irreducible polynomials in F[x]. But if the field F is
finite, it is not immediately clear whether there are an infinite number of monic, irreducible
polynomials in F[x]. Whereas Theorem 3.4 told us that there are an infinite number of prime
numbers, a similar argument will show that F[x] always contains an infinite number of monic,
irreducible polynomials.


Corollary 12.5. There are an infinite number of monic, irreducible polynomials in F[x].


Proof. We will proceed with a proof by contradiction and begin by supposing there are only a
finite number of monic, irreducible polynomials in F[x]. Then there is a finite list


p_1(x), p_2(x), ..., p_m(x),
which consists of all the monic, irreducible polynomials in F[x]. Next, let


f(x) = p_1(x)·p_2(x)···p_m(x) + 1.


By Theorem 12.4, there is a monic, irreducible polynomial q(x) ∈ F[x] which divides f(x).
But since q(x) belongs to the list p_1(x), p_2(x), ..., p_m(x), it follows that q(x) also divides
p_1(x)·p_2(x)···p_m(x). As a result, q(x) must divide


f(x) − p_1(x)·p_2(x)···p_m(x) = 1.


However, since the degree of q(x) is at least one, it is impossible for q(x) to divide 1, so we
have reached a contradiction.


As we will now see, the ideas behind the proofs of Theorem 3.4 and Corollary 12.5 have some 
further consequences. 


Examples


In Z_2[x] the only polynomials of degree 1 are x and x + [1]_2. If we multiply the two of
them and then add [1]_2, we obtain


f(x) = (x·(x + [1]_2)) + [1]_2 = x^2 + x + [1]_2.


Observe that since f(x) has degree 2, but no factor of degree 1, it follows that f(x) must
be irreducible in Z_2[x]. You should now convince yourself that f(x) is the only monic,
irreducible polynomial in Z_2[x] of degree 2.


If we instead multiply two copies of x with x + [1]_2 or two copies of x + [1]_2 with x before
adding [1]_2, we obtain


g(x) = (x^2·(x + [1]_2)) + [1]_2 = x^3 + x^2 + [1]_2
and
h(x) = (x·(x + [1]_2)^2) + [1]_2 = x^3 + x + [1]_2.


Note that both g(x) and h(x) have degree 3, yet have no factors of degree 1. Thus, g(x)
and h(x) are both irreducible in Z_2[x]. Again, you should convince yourself that g(x) and
h(x) are the only monic, irreducible polynomials in Z_2[x] of degree 3.
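One way to "convince yourself" is an exhaustive search, since Z_2[x] has only finitely many polynomials of each degree. The following sketch lists every polynomial of degree d that is not a product of two polynomials of smaller degree. The function names and the tuple representation (coefficients 0 or 1, lowest degree first) are assumptions made for this illustration.

```python
from itertools import product

def poly_mul_mod2(a, b):
    """Multiply two polynomials over Z_2 (tuples of 0/1, lowest degree first)."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] ^= ai & bj      # addition mod 2 is XOR
    return tuple(result)

def polys_of_degree(d):
    """All polynomials of exact degree d over Z_2; the leading coefficient must be 1."""
    return [tuple(low) + (1,) for low in product((0, 1), repeat=d)]

def irreducibles(d):
    """Polynomials of degree d over Z_2 that are not products of lower-degree factors."""
    reducible = set()
    for k in range(1, d // 2 + 1):        # k is the degree of the smaller factor
        for a in polys_of_degree(k):
            for b in polys_of_degree(d - k):
                reducible.add(poly_mul_mod2(a, b))
    return sorted(p for p in polys_of_degree(d) if p not in reducible)

print(irreducibles(2))   # expect only x^2 + x + 1
print(irreducibles(3))   # expect x^3 + x^2 + 1 and x^3 + x + 1
```

Here (1, 1, 1) stands for x^2 + x + [1]_2, while (1, 0, 1, 1) and (1, 1, 0, 1) stand for x^3 + x^2 + [1]_2 and x^3 + x + [1]_2. A search like this is no substitute for the argument, but it confirms that no candidate was overlooked.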


12.2 Division Algorithm 


Let us recall that when the degree of a polynomial was defined in Definition 9.2, the definition
only referred to polynomials with at least one nonzero coefficient. Therefore, we did not
assign any degree to the polynomial f(x) = 0. Some books handle this problem by referring to
the degree of the polynomial f(x) = 0 as −∞. However, the approach that we will take is that
the polynomial, all of whose coefficients are zero, is not assigned a degree.


Theorem 12.6 (The Division Algorithm for F[x]). If a(x) is a nonzero element of F[x] and
if f(x) ∈ F[x], then there exist unique polynomials q(x) and r(x) in F[x] with the properties
that


f(x) = q(x)·a(x) + r(x) and either deg(r(x)) < deg(a(x)) or r(x) = 0.
We call q(x) the quotient and r(x) the remainder.


Intuition. Let us consider the existence of q(x) and r(x) in the case where we divide
3x^4 + 8x^3 + x^2 + 7x − 5 by 2x^2 + 1 in Q[x]. If you have ever performed long division with
polynomials, the ideas and computation involved will look familiar. This example will
illustrate why it is necessary for the coefficients of our polynomials to belong to a field and not
just a commutative ring. First, we need to find a monomial q_1(x) such that the leading term of
q_1(x)·(2x^2 + 1) is equal to the leading term of 3x^4 + 8x^3 + x^2 + 7x − 5. Certainly, q_1(x) = (3/2)x^2
does the trick as


q_1(x)·(2x^2 + 1) = 3x^4 + (3/2)x^2.


Observe that, in this case, q_1(x) ∈ Q[x], but q_1(x) ∉ Z[x]. In fact, there is no polynomial in
Z[x] that, when multiplied by 2x^2 + 1, produces a leading term of 3x^4. Since it is necessary to
divide by 2 to find q_1(x), we can see that our coefficients need to belong to a field. Next, let


g_1(x) = (3x^4 + 8x^3 + x^2 + 7x − 5) − q_1(x)·(2x^2 + 1) =
(3x^4 + 8x^3 + x^2 + 7x − 5) − (3x^4 + (3/2)x^2) = 8x^3 − (1/2)x^2 + 7x − 5.

For the moment, let us suppose that there exist q_*(x), r(x) ∈ Q[x] such that
(1) g_1(x) = q_*(x)·(2x^2 + 1) + r(x) and either deg(r(x)) < 2 or r(x) = 0.
Since

3x^4 + 8x^3 + x^2 + 7x − 5 = q_1(x)·(2x^2 + 1) + g_1(x),
it would follow from (1) that

3x^4 + 8x^3 + x^2 + 7x − 5 = q_1(x)·(2x^2 + 1) + g_1(x) =
q_1(x)·(2x^2 + 1) + q_*(x)·(2x^2 + 1) + r(x) = (q_1(x) + q_*(x))·(2x^2 + 1) + r(x).


Thus, in this case, q(x) = q_1(x) + q_*(x) would be the quotient and r(x) the remainder.
Essentially, this argument illustrates that if all polynomials of degree less than 4 produced an
appropriate quotient and remainder when divided by 2x^2 + 1, then polynomials of degree 4
would also produce an appropriate quotient and remainder when divided by 2x^2 + 1. This is
precisely the type of reasoning used in Mathematical Induction proofs when we show that
k + 1 ∈ T whenever {1, 2, ..., k} ⊆ T. In light of this, it should come as no surprise that our
formal proof of the existence of q(x) and r(x) will use Mathematical Induction.


We now return to the explicit computation of q(x) and r(x) in our example. Having already
found q_1(x) and g_1(x), we now need to find a monomial q_2(x) such that the leading term of
q_2(x)·(2x^2 + 1) is the same as the leading term of g_1(x) = 8x^3 − (1/2)x^2 + 7x − 5. Therefore,
q_2(x) = 4x and we let


g_2(x) = g_1(x) − q_2(x)·(2x^2 + 1) = (8x^3 − (1/2)x^2 + 7x − 5) − (8x^3 + 4x) = −(1/2)x^2 + 3x − 5.


Next, let q_3(x) be a monomial such that the leading term of q_3(x)·(2x^2 + 1) is the same as the
leading term of g_2(x) = −(1/2)x^2 + 3x − 5. Therefore, q_3(x) = −1/4 and we let

g_3(x) = g_2(x) − q_3(x)·(2x^2 + 1) = (−(1/2)x^2 + 3x − 5) − (−(1/2)x^2 − 1/4) = 3x − 19/4.

Since 3x − 19/4 has smaller degree than 2x^2 + 1, it will be our remainder, and it is just a matter
of doing a little bookkeeping to find q(x). Putting the pieces together, we obtain

3x^4 + 8x^3 + x^2 + 7x − 5 = q_1(x)·(2x^2 + 1) + g_1(x) =
q_1(x)·(2x^2 + 1) + (q_2(x)·(2x^2 + 1) + g_2(x)) =
(q_1(x) + q_2(x))·(2x^2 + 1) + (q_3(x)·(2x^2 + 1) + g_3(x)) =
(q_1(x) + q_2(x) + q_3(x))·(2x^2 + 1) + g_3(x) = ((3/2)x^2 + 4x − 1/4)·(2x^2 + 1) + (3x − 19/4).

In our example, deg(g_1(x)) > deg(g_2(x)) > deg(g_3(x)) and g_3(x) has smaller degree than
2x^2 + 1. In all cases, unless some g_i(x) = 0, the degrees of the g_i(x)'s will always be
decreasing. Therefore, there will always eventually be some positive integer k such that either
g_k(x) = 0 or deg(g_k(x)) is smaller than the degree of the polynomial we are dividing by. The
polynomial g_k(x) is now the remainder, and we can then perform the same type of
bookkeeping as previously to find q(x). Note that q(x) will always be the sum of the q_i(x)'s
that we found along the way.


For another example, suppose we wish to divide x^2 + 5x − 4 by 3x − 1. First we need to find a
monomial q_1(x) such that the leading term of q_1(x)·(3x − 1) is the same as the leading term of
x^2 + 5x − 4. Therefore, q_1(x) = (1/3)x and we let


g_1(x) = (x^2 + 5x − 4) − q_1(x)·(3x − 1) = (x^2 + 5x − 4) − (x^2 − (1/3)x) = (16/3)x − 4.


Next, q_2(x) must be a monomial such that the leading term of q_2(x)·(3x − 1) is the same as
the leading term of g_1(x) = (16/3)x − 4. Therefore, q_2(x) = 16/9 and we let


g_2(x) = ((16/3)x − 4) − q_2(x)·(3x − 1) = ((16/3)x − 4) − ((16/3)x − 16/9) = −20/9.


Observe that when finding q_1(x) and q_2(x), it was necessary to use the multiplicative inverses
of 3 and 9, once again pointing out the need to be working in a field. Since −20/9 has smaller
degree than 3x − 1, it is our remainder. As noted earlier, our quotient will be
q_1(x) + q_2(x) = (1/3)x + 16/9. This results in


x^2 + 5x − 4 = ((1/3)x + 16/9)·(3x − 1) − 20/9,
as desired.
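The repeated "cancel the leading term" step in these two examples translates directly into a loop. The following is a minimal sketch of long division in Q[x] using exact rational arithmetic. The names `trim` and `poly_divmod`, and the choice to store coefficients lowest degree first with the zero polynomial as an empty list, are assumptions made for this illustration.

```python
from fractions import Fraction

def trim(p):
    """Drop zero leading coefficients; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, a):
    """Divide f by a in Q[x]; coefficient lists are lowest degree first.

    Returns (quotient, remainder) with remainder == [] or
    deg(remainder) < deg(a).
    """
    remainder = trim(Fraction(c) for c in f)
    a = trim(Fraction(c) for c in a)
    if not a:
        raise ZeroDivisionError("cannot divide by the zero polynomial")
    quotient = [Fraction(0)] * max(len(remainder) - len(a) + 1, 0)
    while len(remainder) >= len(a):
        shift = len(remainder) - len(a)
        coeff = remainder[-1] / a[-1]     # requires inverting the leading coefficient
        quotient[shift] = coeff           # this is the monomial q_i(x)
        for i, c in enumerate(a):
            remainder[i + shift] -= coeff * c
        remainder = trim(remainder)       # this is the next g_i(x)
    return quotient, remainder

# 3x^4 + 8x^3 + x^2 + 7x - 5 divided by 2x^2 + 1
q1, r1 = poly_divmod([-5, 7, 1, 8, 3], [1, 0, 2])
# x^2 + 5x - 4 divided by 3x - 1
q2, r2 = poly_divmod([-4, 5, 1], [-1, 3])
print(q1, r1)
print(q2, r2)
```

Each pass through the loop computes one monomial q_i(x) and the next g_i(x); the division `remainder[-1] / a[-1]` is exactly the place where a field is needed.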


Proof. If f(x) = 0, then q(x) = 0 and r(x) = 0 are easily seen to be the quotient and
remainder. Next, we will use Mathematical Induction to prove that for every nonzero
polynomial f(x), there always exist a quotient q(x) and remainder r(x) with the desired
properties. To this end, we will let a(x) ≠ 0 be an element of F[x] and will let T be the set of
integers n ≥ 0 such that every element of F[x] of degree n yields a quotient and remainder
with the desired properties when divided by a(x). It suffices to show that T contains all
integers greater than or equal to 0.


To show that 0 ∈ T, suppose f(x) ∈ F[x] has degree 0. Therefore, f(x) = α, for some
0 ≠ α ∈ F. If deg(a(x)) > 0, then q(x) = 0 and r(x) = f(x) are our quotient and remainder as


f(x) = 0·a(x) + f(x) and deg(f(x)) < deg(a(x)).


On the other hand, if deg(a(x)) = 0, then a(x) = β for some 0 ≠ β ∈ F. In this case q(x) = α/β
and r(x) = 0 are our quotient and remainder as


f(x) = (α/β)·a(x) + 0 and r(x) = 0.


Thus, in both cases, 0 ∈ T.


Using the Second Version of Mathematical Induction, it now suffices to show that whenever
there is an integer k ≥ 0 such that {0, 1, ..., k} ⊆ T, then T also contains k + 1. We now
suppose that f(x) ∈ F[x] has degree k + 1. There are two cases to consider as either
deg(a(x)) > deg(f(x)) or deg(a(x)) ≤ deg(f(x)). In the first case, q(x) = 0 and r(x) = f(x)
are our quotient and remainder as


f(x) = 0·a(x) + f(x) and deg(f(x)) < deg(a(x)).




In the second case, we can let αx^{k+1} be the leading term of f(x) and βx^m be the leading term
of a(x). Since k + 1 ≥ m, we can let q_1(x) = (α/β)x^{k+1−m} ∈ F[x] and can also let


(2) g_1(x) = f(x) − q_1(x)·a(x).


The leading terms of f(x) and q_1(x)·a(x) are both αx^{k+1}, so either g_1(x) has degree less than
k + 1 or g_1(x) = 0. If deg(g_1(x)) < k + 1, then deg(g_1(x)) ∈ T, and this implies that there exist
q_*(x), r(x) ∈ F[x] such that


g_1(x) = q_*(x)·a(x) + r(x) and deg(r(x)) < deg(a(x)) or r(x) = 0.
Along with (2), the previous equation implies that
f(x) = q_1(x)·a(x) + g_1(x) = q_1(x)·a(x) + q_*(x)·a(x) + r(x) =
(q_1(x) + q_*(x))·a(x) + r(x) and deg(r(x)) < deg(a(x)) or r(x) = 0.
Thus, q(x) = q_1(x) + q_*(x) is the quotient and r(x) the remainder.
Finally, if g_1(x) = 0, then q_1(x) is the quotient and r(x) = 0 the remainder as
f(x) = q_1(x)·a(x) + 0 and r(x) = 0.


To complete the proof, we need to show that the quotient q(x) and remainder r(x) are unique.
Suppose, on two different occasions, we divide f(x) by a(x) and obtain


f(x) = q_1(x)·a(x) + r_1(x) and f(x) = q_2(x)·a(x) + r_2(x),


where r_1(x) and r_2(x) are both either 0 or have smaller degree than a(x). To prove that the
quotient and remainder are unique, we need to show that q_1(x) = q_2(x) and r_1(x) = r_2(x).


Since
q_1(x)·a(x) + r_1(x) = f(x) = q_2(x)·a(x) + r_2(x),
if we subtract both q_2(x)·a(x) and r_1(x) from the previous equation, we obtain
q_1(x)·a(x) − q_2(x)·a(x) = r_2(x) − r_1(x).
Therefore,
(3) (q_1(x) − q_2(x))·a(x) = r_2(x) − r_1(x).


If q_2(x) − q_1(x) ≠ 0, then the fact that a(x) ≠ 0 implies that the left-hand side of (3) has
degree greater than or equal to the degree of a(x). However, since r_1(x) and r_2(x) are either 0
or have smaller degree than a(x), it follows that the right-hand side of (3) is 0 or has smaller
degree than a(x). This is a contradiction, so it must be the case that q_2(x) − q_1(x) = 0. Using
(3), it immediately follows that q_1(x) = q_2(x) and r_1(x) = r_2(x).


Earlier we indicated that our coefficients need to belong to a field for the division algorithm to 
hold. Our next example should drive this point home. 


Example: There Is No Division Algorithm in Z[x]


In Z[x], suppose we wish to divide x^2 by 2x + 1. Observe that if f(x) ∈ Z[x], then the
leading coefficient of f(x)·(2x + 1) will always be even. In particular, the leading term
of f(x)·(2x + 1) can never be x^2. As a result, the degree of x^2 − f(x)·(2x + 1) will always
be at least 2. Now suppose that q(x), r(x) ∈ Z[x] satisfy


x^2 = q(x)·(2x + 1) + r(x).


Then r(x) = x^2 − q(x)·(2x + 1) and the preceding argument says that the degree of r(x)
must be at least 2. As a result, it is impossible to find a quotient and remainder in Z[x]
such that the remainder is equal to 0 or has degree less than 1. On the other hand, since
there is a division algorithm in Q[x], you can check that when dividing x^2 by 2x + 1 in
Q[x], the quotient is (1/2)x − 1/4 and the remainder is 1/4.
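The check suggested at the end of this example takes only a few lines. The sketch below multiplies the claimed quotient by 2x + 1, adds the claimed remainder, and compares the result with x^2. The helper `poly_mul` and the lowest-degree-first coefficient lists are illustrative assumptions.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two polynomials over Q (coefficient lists, lowest degree first)."""
    result = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] += Fraction(ai) * Fraction(bj)
    return result

quotient = [Fraction(-1, 4), Fraction(1, 2)]   # (1/2)x - 1/4
divisor = [1, 2]                               # 2x + 1
remainder = Fraction(1, 4)

recombined = poly_mul(quotient, divisor)
recombined[0] += remainder                     # q(x)(2x + 1) + r(x)

has_non_integer_coefficient = any(c.denominator != 1 for c in quotient)
print(recombined)
print(has_non_integer_coefficient)
```

The recombined polynomial is x^2, and the quotient has non-integer coefficients. By the uniqueness part of the division algorithm in Q[x], no other quotient and remainder of the required form exist, so none can be found inside Z[x].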


The division algorithm has many interesting and useful applications. Some of these 
applications appear in the next collection of examples as well as in Corollaries 12.7, 12.8, 
and 12.10. 


Examples


1. If we did not have the division algorithm at our disposal, then if we were given
a(x), b(x) ∈ F[x], with a(x) ≠ 0, it would not be easy to determine if a(x) was a
divisor of b(x). However, we can now apply the division algorithm to systematically
determine whether a(x) is a divisor of b(x). Observe that if a(x) | b(x), then
b(x) = m(x)·a(x), for some m(x) ∈ F[x]. The uniqueness aspect of the division
algorithm tells us that m(x) is the only quotient and 0 is the only remainder that can
occur when dividing b(x) by a(x). Therefore, to determine whether a(x) is a divisor of
b(x), we simply perform long division of polynomials and check whether the
remainder is 0. If you are uncomfortable with doing long division of polynomials, you
should review the examples presented before our proof of the division algorithm.


For example, let us consider whether x^2 + 3x + 5 is a divisor of 2x^3 + 7x^2 + 13x + 5 in
Q[x]. You should check that long division results in


2x^3 + 7x^2 + 13x + 5 = (2x + 1)·(x^2 + 3x + 5).


Thus, (x^2 + 3x + 5) | (2x^3 + 7x^2 + 13x + 5).


For another example, let us consider whether [3]_5 x^2 + [2]_5 x + [1]_5 is a divisor of
[2]_5 x^3 + [4]_5 x^2 + x in Z_5[x]. If you perform long division, you will see that


[2]_5 x^3 + [4]_5 x^2 + x = ([4]_5 x + [2]_5)·([3]_5 x^2 + [2]_5 x + [1]_5) + ([3]_5 x + [3]_5).


Since the division algorithm yields a nonzero remainder, we can see that
([3]_5 x^2 + [2]_5 x + [1]_5) ∤ ([2]_5 x^3 + [4]_5 x^2 + x).


2. Another question we briefly discussed before proving the division algorithm was if
a(x) is not a divisor of b(x) in F[x], is it possible that a(x) is a divisor of b(x) in K[x],
where K is a field that contains F? Observe that the division algorithm asserts that
there exist $q(x), r(x) \in F[x]$ such that


(4) $b(x) = q(x) \cdot a(x) + r(x)$ where $\deg(r(x)) < \deg(a(x))$ or $r(x) = 0$.


If $a(x) \mid b(x)$ in K[x], then there exists some $m(x) \in K[x]$ such that $b(x) = m(x) \cdot a(x)$.
Since $F \subseteq K$, the previous equation and (4) provide us with two ways to divide b(x) by
a(x) in K[x]. However, the uniqueness aspect of the division algorithm asserts that each
time we divide b(x) by a(x) in K[x], we must obtain the same quotient and remainder.
Thus, $m(x) = q(x) \in F[x]$ and $r(x) = 0$. Hence, $a(x) \mid b(x)$ in F[x]. As a result, we now
see that a(x) is a divisor of b(x) using the larger field K only if it was already a divisor of 
b(x) using the smaller field F. 
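
The long division in $\mathbb{Z}_5[x]$ carried out in Example 1 can be double-checked by machine. What follows is a minimal Python sketch, assuming p is prime, that classes $[k]_p$ are stored as integers from 0 to p-1, and that coefficients are listed highest degree first. The name `poly_divmod_mod_p` is hypothetical.

```python
def poly_divmod_mod_p(b, a, p):
    """Divide b(x) by a(x) in Z_p[x], p prime; return (quotient, remainder)."""
    a = [c % p for c in a]
    while a and a[0] == 0:  # ignore leading zero coefficients of the divisor
        a.pop(0)
    if not a:
        raise ZeroDivisionError("division by the zero polynomial")
    remainder = [c % p for c in b]
    lead_inverse = pow(a[0], -1, p)  # inverse of the leading coefficient (Python 3.8+)
    quotient = []
    while len(remainder) >= len(a):
        coeff = remainder[0] * lead_inverse % p
        quotient.append(coeff)
        for i in range(len(a)):
            remainder[i] = (remainder[i] - coeff * a[i]) % p
        remainder.pop(0)  # the leading coefficient is now zero
    while remainder and remainder[0] == 0:
        remainder.pop(0)
    return quotient, remainder


# [2]x^3 + [4]x^2 + x divided by [3]x^2 + [2]x + [1] in Z_5[x]
q, r = poly_divmod_mod_p([2, 4, 1, 0], [3, 2, 1], 5)
print(q, r)  # quotient [4]x + [2], remainder [3]x + [3]
```

Because the remainder is nonzero, the uniqueness part of the division algorithm confirms that the quadratic is not a divisor of the cubic in $\mathbb{Z}_5[x]$.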



The first corollary of the division algorithm will be useful, not only in this chapter but also in 
later examinations of fields and roots of polynomials. 


Corollary 12.7. Suppose $F \subseteq K$ are fields, $\alpha \in K$, and g(x) is a polynomial of smallest
possible degree in F[x] that has $\alpha$ as a root. If $f(x) \in F[x]$ has $\alpha$ as a root, then g(x) must be a
divisor of f(x) in F[x].


At first glance, Corollary 12.7 looks like a very abstract statement. However, it can easily be 
applied to some very concrete examples. In light of this, we will present some of these 
examples before proving the corollary. This will enable us to have a greater understanding and 
appreciation of this corollary before we work through its proof. Examples 2-5 will provide 
proofs to various statements made back in Chapters 5 and 9. 


Examples 


1. Since $i \notin \mathbb{R}$ and $i^2 + 1 = 0$, we see that $x^2 + 1$ is a polynomial of smallest possible
degree in R[x] that has i as a root. Therefore, Corollary 12.7 now asserts that any
$f(x) \in \mathbb{R}[x]$ that has i as a root is a multiple of $x^2 + 1$. Combining this with the fact
that any multiple of $x^2 + 1$ must have i as a root, we now know that $f(x) \in \mathbb{R}[x]$ has i
as a root if and only if $x^2 + 1$ is a divisor of f(x).


2. Our next example generalizes the previous one. Suppose $\alpha \in \mathbb{C}$ such that $\alpha \notin \mathbb{R}$;
then there is no polynomial of degree 1 in R[x] with $\alpha$ as a root. However, $\alpha$ is
certainly a root of


$$g(x) = (x - \alpha) \cdot (x - \alpha^*) = x^2 - (\alpha + \alpha^*)x + \alpha\alpha^*,$$

where $\alpha^*$ is the complex conjugate of $\alpha$. By Lemma 5.10(b), $\alpha + \alpha^*, \alpha\alpha^* \in \mathbb{R}$. In light
of this, g(x) is a polynomial of smallest possible degree in R[x] that has $\alpha$ as a root.
With the help of Corollary 12.7, we can now assert that $f(x) \in \mathbb{R}[x]$ has $\alpha$ as a root if
and only if f(x) is a multiple of g(x).


Now suppose that f(x) is any polynomial in R[x] of degree at least 3. The
Fundamental Theorem of Algebra guarantees that f(x) has some root $\alpha \in \mathbb{C}$. If
$\alpha \notin \mathbb{R}$, then the polynomial g(x) just constructed is a divisor of f(x) of degree 2 in
R[x]. Therefore, in this case, f(x) is reducible in R[x]. On the other hand, if $\alpha \in \mathbb{R}$,
then $x - \alpha$ is a polynomial of smallest possible degree in R[x] having $\alpha$ as a root, so
Corollary 12.7 now asserts that $x - \alpha$ is a divisor of f(x) in R[x]. Once again, we can
see that f(x) is reducible in R[x].


In light of the preceding argument, we have now proven the statement, first made in 
Chapter 9, that every polynomial of degree at least 3 in R[x] is reducible in R[x]. 


3. The number $5 - 8\sqrt{2}$ is an element of the field $\mathbb{Q}(\sqrt{2})$ but it does not belong to Q.
Therefore, $5 - 8\sqrt{2}$ cannot be the root of any polynomial of degree 1 in Q[x].
However, it is not hard to check that $5 - 8\sqrt{2}$ is a root of the polynomial
$x^2 - 10x - 103 \in \mathbb{Q}[x]$. Thus, $x^2 - 10x - 103$ is a polynomial of smallest possible
degree in Q[x] that has $5 - 8\sqrt{2}$ as a root. Therefore, Corollary 12.7 asserts that any
$f(x) \in \mathbb{Q}[x]$ that has $5 - 8\sqrt{2}$ as a root must be a multiple of $x^2 - 10x - 103$.


4. The numbers $-\frac{1}{2} \pm 5\sqrt{2}$ are elements of the field $\mathbb{Q}(\sqrt{2})$ that do not belong to Q.
Therefore, neither of $-\frac{1}{2} \pm 5\sqrt{2}$ can be the root of any polynomial of degree 1 in
Q[x]. However, it is not hard to check that $-\frac{1}{2} \pm 5\sqrt{2}$ are the roots of the polynomial
$4x^2 + 4x - 199 \in \mathbb{Q}[x]$. Thus, $4x^2 + 4x - 199$ is a polynomial of smallest possible
degree in Q[x] that has either of $-\frac{1}{2} \pm 5\sqrt{2}$ as a root. Therefore, Corollary 12.7 asserts
that any $f(x) \in \mathbb{Q}[x]$ that has either of $-\frac{1}{2} \pm 5\sqrt{2}$ as a root must be a multiple of
$4x^2 + 4x - 199$.


5. The numbers $7 \pm i\sqrt{3}$ are elements of the field C that do not belong to Q. Therefore,
neither of $7 \pm i\sqrt{3}$ can be the root of any polynomial of degree 1 in Q[x].
However, it is not hard to check that $7 \pm i\sqrt{3}$ are the roots of the polynomial
$x^2 - 14x + 52 \in \mathbb{Q}[x]$. Thus, $x^2 - 14x + 52$ is a polynomial of smallest possible
degree in Q[x] that has either of $7 \pm i\sqrt{3}$ as a root. Therefore, Corollary 12.7 asserts
that any $f(x) \in \mathbb{Q}[x]$ that has either of $7 \pm i\sqrt{3}$ as a root must be a multiple of
$x^2 - 14x + 52$.


6. Let us consider the number $2^{1/3} \in \mathbb{R}$ and let g(x) be a polynomial of smallest possible
degree in Q[x] that has $2^{1/3}$ as a root. We know that $x^3 - 2$ has $2^{1/3}$ as a root, and using
either Eisenstein's Criterion or the Rational Root Test, we know that $x^3 - 2$ is
irreducible in Q[x]. But Corollary 12.7 asserts that g(x) is a divisor of $x^3 - 2$. The
irreducibility of $x^3 - 2$ now implies that g(x) and $x^3 - 2$ must have the same degree,
so $x^3 - 2 = a\,g(x)$, for some nonzero $a \in \mathbb{Q}$. Thus, $x^3 - 2$ is a polynomial of smallest
degree in Q[x] having $2^{1/3}$ as a root. As a result, if $f(x) \in \mathbb{Q}[x]$ has $2^{1/3}$ as a root,
then f(x) must be a multiple of $x^3 - 2$.
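
The claims in Examples 3 through 5 that "it is not hard to check" a given number is a root can be verified with exact arithmetic. The sketch below assumes a hypothetical helper class `QuadraticNumber` that stores $a + b\sqrt{d}$ as a pair of rationals; taking d = -3 models $i\sqrt{3}$. It illustrates the check only and is not a method taken from the text.

```python
from fractions import Fraction


class QuadraticNumber:
    """An element a + b*sqrt(d) with rational a, b and a fixed non-square integer d."""

    def __init__(self, a, b, d):
        self.a, self.b, self.d = Fraction(a), Fraction(b), d

    def __add__(self, other):
        return QuadraticNumber(self.a + other.a, self.b + other.b, self.d)

    def __mul__(self, other):
        # (a + b*sqrt(d)) * (c + e*sqrt(d)) = (ac + be*d) + (ae + bc)*sqrt(d)
        return QuadraticNumber(
            self.a * other.a + self.b * other.b * self.d,
            self.a * other.b + self.b * other.a,
            self.d,
        )

    def is_zero(self):
        return self.a == 0 and self.b == 0


def evaluate(coefficients, alpha):
    """Evaluate a rational polynomial (highest degree first) at alpha by Horner's rule."""
    result = QuadraticNumber(0, 0, alpha.d)
    for c in coefficients:
        result = result * alpha + QuadraticNumber(c, 0, alpha.d)
    return result


# Example 3: 5 - 8*sqrt(2) and x^2 - 10x - 103
print(evaluate([1, -10, -103], QuadraticNumber(5, -8, 2)).is_zero())
# Example 4: -1/2 + 5*sqrt(2) and -1/2 - 5*sqrt(2) with 4x^2 + 4x - 199
print(evaluate([4, 4, -199], QuadraticNumber(Fraction(-1, 2), 5, 2)).is_zero())
print(evaluate([4, 4, -199], QuadraticNumber(Fraction(-1, 2), -5, 2)).is_zero())
# Example 5: 7 + i*sqrt(3), modelled as 7 + sqrt(-3), with x^2 - 14x + 52
print(evaluate([1, -14, 52], QuadraticNumber(7, 1, -3)).is_zero())
```

Each line prints `True`. Exact fractions are used instead of floating point so that a result of zero is a genuine identity and not a rounding coincidence.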


Proof of Corollary 12.7. If $f(x) \in F[x]$ has $\alpha$ as a root, we can apply the division algorithm to
obtain $q(x), r(x) \in F[x]$ such that


$$f(x) = q(x) \cdot g(x) + r(x) \quad \text{where } \deg(r(x)) < \deg(g(x)) \text{ or } r(x) = 0.$$

Therefore, $r(x) = f(x) - q(x) \cdot g(x)$ and plugging $\alpha$ into this equation yields

$$r(\alpha) = f(\alpha) - q(\alpha) \cdot g(\alpha) = 0 - q(\alpha) \cdot 0 = 0.$$


Thus, if r(x) is not 0, then r(x) also has $\alpha$ as a root. However, it is impossible for such an r(x) to have smaller degree
than g(x), since g(x) has the smallest possible degree among polynomials in F[x] that
have $\alpha$ as a root. Therefore, the only remaining possibility is that r(x) = 0. As a result,
$f(x) = q(x) \cdot g(x)$, so g(x) is a divisor of f(x).


In an example that preceded the proof of Corollary 12.7, we showed that all polynomials of 
degree at least 3 in R[x] are reducible in R[x]. Combining this with the existence portion 
of Theorem 12.4, it immediately follows that 


Corollary 12.8. Every polynomial in R[x] of degree at least 1 can be written as a product of 
linear polynomials and irreducible quadratic polynomials in R[x]. 


It is important to realize that although every f(x) in R[x] of degree at least 1 can be factored
into a product of linear and irreducible quadratic polynomials in R[x], there is, in many cases,
no algorithm for finding these factors. For example, in Section 10.2 we pointed out that the
polynomials $x^5 - 6x + 3$, $x^5 - 4x + 2$, and $x^5 - 8x + 6$ all have exactly three real roots.
Therefore, each of these polynomials can be factored in R[x] into a product of three linear
factors and one irreducible quadratic. However, as noted in Section 10.2, Galois' work on the
insolvability of the quintic implies that there is no algorithm for finding these factors that
only involves various combinations of the polynomial's coefficients using addition, subtraction,
multiplication, division, and taking nth roots, for various $n \in \mathbb{N}$.


Exercises for Sections 12.1 and 12.2 


1. 


Let f(x) = x4 —7x? — 44. 


(a) Factor f(x) completely in Q[x]. 
(b) Factor f(x) completely in R[x]. 
(c) Factor f(x) completely in C[x]. 


Let g(x) = x* +46. 


(a) Factor g(x) completely in Q[x]. 
(b) Factor g(x) completely in R[x]. 


(c) Factor g(x) completely in C[x]. 


Let h(x) = 2x7 +3x? —9x—5. 


(a) Factor h(x) completely in Q[x]. 
(b) Factor h(x) completely in R[x]. 
(c) Factor h(x) completely in C[x]. 


Let j(x) = x44+403 +42? —4. 


(a) Factor j(x) completely in Q[x]. 
(b) Factor j(x) completely in R[x]. 


(c) Factor j(x) completely in C[x]. 


Let k(x) = x4 -—15x+14. 


(a) Factor k(x) completely in Q[x]. 
(b) Factor k(x) completely in R[x]. 
(c) Factor k(x) completely in C[x]. 


Let $l(x) = x^4 - 15x + 21$.


(a) Factor l(x) completely in Q[x].


(b) How many irreducible factors are there when l(x) is factored completely in R[x]?


(c) How many irreducible factors are there when l(x) is factored completely in C[x]?


7. Let m(x) = 2x? + 3x? —9x—-6. 
(a) Factor m(x) completely in Q[x]. 
(b) How many irreducible factors are there when m(x) is factored completely in R[x]? 
(c) How many irreducible factors are there when m(x) is factored completely in C[x]? 


8. Let w(x) = 2x7 +3x? —9x+ 12. 
(a) Factor w(x) completely in Q[x]. 


(b) How many irreducible factors are there when w(x) is factored completely in R[x]? 
(c) How many irreducible factors are there when w(x) is factored completely in C[x]? 
For exercises 9-10, you might first want to refer to Theorem 9.15. 


9. Let a(x) =x*-—61x7 +1. 
(a) Factor a(x) completely in Q[x]. 


(b) Factor a(x) completely in R[x]. 
(c) Factor a(x) completely in C[x]. 


10. Let bx) =x4+61x? +1. 
(a) Factor b(x) completely in Q[x]. 


(b) Factor b(x) completely in R[x]. 
(c) Factor b(x) completely in C[x]. 
In exercises 11-16, completely factor the given polynomial in Z3[x]. 
ll. x7+x+4+[1) 
12. [2]3x° + [2]5x?+x+11] 
13,. x +274 x4 [1] 
14. [2]3x° +x? + [2]3x+[1]s 
15. [2]3x7+x+[1]3 
16. x8+x+4+[1] 
In exercises 17-22, completely factor the given polynomial in Zs[x]. 
17. e+ Bis 
18. x°+[4]sx+(Hs 
19. [3]}5x*+[2]sx+[1]s 




20. x3 +([3]sx* + [3]sx+ [4]s 

21. [2]sx3 + [4]5x* + [2]s 

22. [4]sx? + [2]5x°+x+[3]s 

In exercises 23-28, completely factor the given polynomial in Z7[x]. 

23. x?+([3h 

24. x7+[4]7 

25. x°+[2hx+[5h 

26. x?+([2]7x+[6]7 

27. x3 +[2)x*+[Shx+[3]7 

28. x°+[6]7x? + [4]7 

In exercises 29-32, find the quotient and remainder in Zs[x] when f(x) is divided by a(x). 
29. f(x) = [4]sx* + x+ [2]s and a(x) = x + [3]s 

30. f(x) =[4]sx* +x+[2]5 and a(x) = x+[4]5 

31. f(x) = [2]5x? + [4]sx + [3]s and a(x) = x+[2]s 

32. f(x) = [2]sx° + [4]sx+ [3]s and a(x) = [2]5x* + [1]s 

In exercises 33-36, find the quotient and remainder in Z,;[x] when f(x) is divided by a(x). 
33. f(x) = [7x7 + [8]ix+[10]11 and a(x) = x+ [5] 

34. f(x) =[TMix* + [8]rrx + [10]) and a(x) = x4 [Tu 

35. f(x) = [S]nx? + B)ix* + [6] and a(x) = x+ [8] 

36. f(x) = [5]x? + 3)ix? + [6] and a(x) = [4]11x7 + [8] 1x 


In exercises 37-39, p is a prime number. For exercises 38 and 39, it will probably be easier to 
first count the number of monic, reducible polynomials of the desired degree. 


37. Determine the number of monic, irreducible polynomials of degree | in Z,[x]. 
38. Determine the number of monic, irreducible polynomials of degree 2 in Z,[x]. 
39. Determine the number of monic, irreducible polynomials of degree 3 in Z,[x]. 


40. Let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0 \in F[x]$, where F is a field, and suppose
$\alpha_1, \alpha_2, \ldots, \alpha_n \in F$ are the n (not necessarily distinct) roots of f(x).
(a) Express $a_{n-1}$ in terms of the $\alpha_i$.
(b) Express $a_0$ in terms of the $\alpha_i$.


41. Let $F(x) = x^4 + 6x^3 - 7x - 3$.
(a) In C, what is the sum of the roots of F(x)?
(b) In C, what is the product of the roots of F(x)?


42. Let G(x) = 3x° —4x3 + I1x— 19.
(a) In C, what is the sum of the roots of G(x)?
(b) In C, what is the product of the roots of G(x)?


43. Let H(x) = [6]7x° + [4]7x° + [2]7x7 + [3]7x + [5]7 € Z7[x] and suppose F is a field
which contains Z7 such that all the irreducible factors of H(x) in F are of degree 1.
(a) In F, what is the sum of the roots of H(x)?
(b) In F, what is the product of the roots of H(x)?


44. Let $f(x) = (x^2 - 5)^3 (x^2 + 3)^2 \in \mathbb{Q}[x]$.
(a) In Q[x], how many different monic polynomials are divisors of f(x)?
(b) In R[x], how many different monic polynomials are divisors of f(x)?
(c) In C[x], how many different monic polynomials are divisors of f(x)?


45. Let g(x) = 24(x+3)4 (x — 2)3 (x? +2x +3)°(x3 —7) € Q[x].
(a) In Q[x], how many different monic polynomials are divisors of g(x)?
(b) In R[x], how many different monic polynomials are divisors of g(x)?
(c) In C[x], how many different monic polynomials are divisors of g(x)?


46. Let h(x) = (4x —5)?(2x? + 1)° (3x? —7)* € Q[x].
(a) In Q[x], how many different monic polynomials are divisors of h(x)?
(b) In R[x], how many different monic polynomials are divisors of h(x)?
(c) In C[x], how many different monic polynomials are divisors of h(x)?


47. In Z2[x], how many different monic polynomials divide x(x+[1]o)? (x? +44 [1]2)3?


48. Suppose a field F has n elements and $F = \{a_1, a_2, \ldots, a_n\}$. Show that the polynomial
$w(x) = (x - a_1)(x - a_2) \cdots (x - a_n) + 1_F$ has no roots in F, where $1_F$ denotes the
multiplicative identity in F.


49. Show that if a field F has n elements and $m \geq n$, then there exists some polynomial
$g(x) \in F[x]$ of degree m such that g(x) has no roots in F.




12.3 Irreducible and Minimum Polynomials 


Let us now consider the situation where $F \subseteq K$ and $\alpha \in K$ is the root of some nonzero
polynomial in F[x]. If g(x) and h(x) are both polynomials of smallest possible degree in F[x]
which have $\alpha$ as a root, then Corollary 12.7 asserts that $g(x) \mid h(x)$ and $h(x) \mid g(x)$. This
immediately implies that g(x) and h(x) have the same degree, so there is some $\beta \in F$ such that
$h(x) = \beta g(x)$. If $\gamma$ is the leading coefficient of g(x), then $m(x) = \gamma^{-1}g(x)$ is a monic
polynomial in F[x] that has the same degree as g(x) and also has $\alpha$ as a root. Combining these
facts, and using $g(x) = \gamma m(x)$, we have $h(x) = (\beta\gamma)m(x)$. This tells us that any polynomial in F[x] of smallest
possible degree that has $\alpha$ as a root must be equal to an element of F times m(x). In particular,
if $w(x) \in F[x]$ is also a monic polynomial of this same degree having $\alpha$ as a root, then w(x)
must be equal to m(x). In light of this observation, m(x) is the unique monic polynomial in
F[x] of this minimal degree that has $\alpha$ as a root. The polynomial m(x) is often referred to
as the minimum polynomial for $\alpha$ over F and minimum polynomials will be of great
importance in our examination of field extensions in Chapter 15. If we are given some $\alpha \in K$,
it is often quite easy to find monic polynomials in F[x] that have $\alpha$ as a root. However, at
this point, it is not clear how to determine which of these monic polynomials is the minimum
polynomial. Recall that before the proof of Corollary 12.7, we observed that $x^3 - 2$ is a monic,
irreducible polynomial in Q[x] that has $2^{1/3}$ as a root. We then used the irreducibility of $x^3 - 2$
to show that $x^3 - 2$ is the minimum polynomial for $2^{1/3}$ over Q. Indeed, as the next lemma will
confirm, irreducibility is a criterion that can be used to determine if a polynomial is the
minimum polynomial.


Lemma 12.9. Let $F \subseteq K$ be fields and let $\alpha \in K$. If $m(x) \in F[x]$ is monic and has $\alpha$ as a root,
then m(x) is the minimum polynomial for $\alpha$ over F if and only if m(x) is irreducible in F[x].


Proof. In one direction, suppose m(x) is the minimum polynomial for $\alpha$ over F. By way of
contradiction, suppose m(x) is reducible in F[x]. Thus, $m(x) = a(x) \cdot b(x)$, where
$a(x), b(x) \in F[x]$ and $\deg(a(x)), \deg(b(x)) < \deg(m(x))$. Plugging in $\alpha$, we see that


$$0 = m(\alpha) = a(\alpha) \cdot b(\alpha).$$


Since $a(\alpha)$ and $b(\alpha)$ belong to the field K, which has no zero divisors, this implies that
$a(\alpha) = 0$ or $b(\alpha) = 0$. However, this says
that either a(x) or b(x) has smaller degree than m(x) and has $\alpha$ as a root. But this contradicts
the fact that m(x) has the smallest possible degree from among all polynomials in F[x] which
have $\alpha$ as a root. Thus, we can conclude that m(x) is indeed irreducible in F[x].


In the other direction, suppose $m(x) \in F[x]$ is monic, irreducible, and has $\alpha$ as a root. If g(x) is
the minimum polynomial for $\alpha$ over F, then Corollary 12.7 implies that $g(x) \mid m(x)$ in F[x].
However, since m(x) is irreducible, it must be the case that $\deg(g(x)) = \deg(m(x))$, so
$m(x) = \gamma g(x)$, for some $\gamma \in F$. But since m(x) and g(x) are both monic, this immediately
implies that m(x) = g(x). As a result, m(x) is the minimum polynomial for $\alpha$ over F.




Examples


1. Consider $7^{1/5} \in \mathbb{R}$; certainly $x^5 - 7$ is a monic polynomial in Q[x] having $7^{1/5}$ as a root.
By Eisenstein's Criterion, $x^5 - 7$ is irreducible over Q. Therefore, $x^5 - 7$ is the
minimum polynomial for $7^{1/5}$ over Q.


2. Consider $i \in \mathbb{C}$; certainly $x^2 + 1$ is a monic polynomial in R[x] having i as a root.
Since $x^2 + 1$ is irreducible over R, it is the minimum polynomial for i over R.


3. Given $\alpha = 5 + 7i \in \mathbb{C}$, let us try to find the minimum polynomial for $5 + 7i$ over Q and
over R. The first thing we need to do is to find polynomials in Q[x] and R[x] that
have $5 + 7i$ as a root. Then we will check if the polynomials we found are irreducible.
Observe that if $\alpha = 5 + 7i$, then $\alpha - 5 = 7i$ and squaring both sides yields


$$\alpha^2 - 10\alpha + 25 = -49.$$


Thus, $\alpha^2 - 10\alpha + 74 = 0$, and we see that $5 + 7i$ is a root of $x^2 - 10x + 74$. Since
$x^2 - 10x + 74$ has no real roots, it is irreducible over both Q and R. As a result,
$x^2 - 10x + 74$ is the minimum polynomial for $5 + 7i$ over both Q and R.


4. Consider $\alpha = \sqrt{2} + \sqrt{3} \in \mathbb{R}$; we would like to find the minimum polynomial for
$\sqrt{2} + \sqrt{3}$ over Q. To find a monic polynomial in Q[x] which has $\sqrt{2} + \sqrt{3}$ as a root
will require some computations. If $\alpha = \sqrt{2} + \sqrt{3}$, then


$$\alpha^2 = (\sqrt{2} + \sqrt{3})^2 = 2 + 2\sqrt{6} + 3 = 5 + 2\sqrt{6}.$$

This implies that

$$\alpha^2 - 5 = 2\sqrt{6}$$

and squaring both sides yields

$$\alpha^4 - 10\alpha^2 + 25 = 24.$$


Thus, $\alpha^4 - 10\alpha^2 + 1 = 0$. As a result, $\sqrt{2} + \sqrt{3}$ is a root of $x^4 - 10x^2 + 1 \in \mathbb{Q}[x]$.
However, at this point, it is not clear if $x^4 - 10x^2 + 1$ is the minimum polynomial for
$\sqrt{2} + \sqrt{3}$ over Q. In light of Lemma 12.9, it is sufficient to determine if $x^4 - 10x^2 + 1$ is
irreducible over Q. Note that neither the Rational Root Test nor Eisenstein's Criterion
gives us any information at this point. On the other hand, $x^4 - 10x^2 + 1$ is one of the
polynomials described by Lemma 9.12. If we look back at Lemma 9.12, we have
A = 10, A + 2 = 12, and A - 2 = 8, so the lemma tells us that $x^4 - 10x^2 + 1$ is
irreducible over Q. Hence, $x^4 - 10x^2 + 1$ is the minimum polynomial for $\sqrt{2} + \sqrt{3}$
over Q.




Now let us look at a related question: what is the minimum polynomial for $\sqrt{2} + \sqrt{3}$ over the
field $\mathbb{Q}(\sqrt{2})$? It is not difficult to find a quadratic polynomial in $\mathbb{Q}(\sqrt{2})[x]$ which has
$\alpha = \sqrt{2} + \sqrt{3}$ as a root. To see this, observe that


$$\alpha - \sqrt{2} = \sqrt{3}$$

and squaring both sides gives us

$$\alpha^2 - 2\sqrt{2}\alpha + 2 = 3.$$

This implies that

$$\alpha^2 - 2\sqrt{2}\alpha - 1 = 0,$$


which tells us that $\sqrt{2} + \sqrt{3}$ is a root of $x^2 - 2\sqrt{2}x - 1 \in \mathbb{Q}(\sqrt{2})[x]$. To determine if
$x^2 - 2\sqrt{2}x - 1$ is the minimum polynomial for $\sqrt{2} + \sqrt{3}$ over $\mathbb{Q}(\sqrt{2})$, we must determine if
$x^2 - 2\sqrt{2}x - 1$ is irreducible in $\mathbb{Q}(\sqrt{2})[x]$.


You can easily check that $\sqrt{2} \pm \sqrt{3}$ are the roots of $x^2 - 2\sqrt{2}x - 1$. Therefore, the only way
$x^2 - 2\sqrt{2}x - 1$ could be reducible over $\mathbb{Q}(\sqrt{2})$ would be for $\sqrt{2} \pm \sqrt{3}$ to belong to $\mathbb{Q}(\sqrt{2})$.
In particular, this would imply that $\sqrt{3} \in \mathbb{Q}(\sqrt{2})$. There are several ways to determine whether
$\sqrt{3}$ belongs to $\mathbb{Q}(\sqrt{2})$. Our approach will be to apply some of the tools involving
automorphisms and roots of polynomials introduced in Chapter 5. Recall that the function $\sigma$
defined as $\sigma(a + b\sqrt{2}) = a - b\sqrt{2}$, for all $a, b \in \mathbb{Q}$, is an automorphism of $\mathbb{Q}(\sqrt{2})$. If
$\sqrt{3} \in \mathbb{Q}(\sqrt{2})$, then there exist $a, b \in \mathbb{Q}$ such that $\sqrt{3} = a + b\sqrt{2}$. Since $\sqrt{3}$ is a root of
$x^2 - 3 \in \mathbb{Q}[x]$, Corollary 5.13 asserts that $\sigma(\sqrt{3})$ is also a root of $x^2 - 3$. Therefore, either


$$\sigma(\sqrt{3}) = \sqrt{3} \quad \text{or} \quad \sigma(\sqrt{3}) = -\sqrt{3}.$$


In the first case,

$$\sqrt{3} = a + b\sqrt{2} \quad \text{and} \quad \sqrt{3} = \sigma(\sqrt{3}) = \sigma(a + b\sqrt{2}) = a - b\sqrt{2}.$$

Adding these two equations yields


$$2\sqrt{3} = 2a,$$


which implies that $\sqrt{3} = a \in \mathbb{Q}$. But this is a contradiction as $\sqrt{3}$ is not rational. In the second
case, we have


$$\sqrt{3} = a + b\sqrt{2} \quad \text{and} \quad -\sqrt{3} = \sigma(\sqrt{3}) = \sigma(a + b\sqrt{2}) = a - b\sqrt{2}.$$

Subtracting the second equation from the first yields


$$2\sqrt{3} = 2b\sqrt{2}.$$


Multiplying both sides of this equation by $\sqrt{2}$ and dividing by 2 results in

$$\sqrt{6} = 2b \in \mathbb{Q}.$$


But this is also a contradiction as $\sqrt{6}$ is not rational. In light of this, $x^2 - 2\sqrt{2}x - 1$ is indeed
irreducible over $\mathbb{Q}(\sqrt{2})$, so $x^2 - 2\sqrt{2}x - 1$ is the minimum polynomial for $\sqrt{2} + \sqrt{3}$ over
$\mathbb{Q}(\sqrt{2})$.
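
Both polynomial identities used above for $\sqrt{2} + \sqrt{3}$ can be confirmed with exact arithmetic. The sketch below stores a number $a + b\sqrt{2} + c\sqrt{3} + e\sqrt{6}$ as the tuple `(a, b, c, e)`; this representation and the helper names `multiply`, `add`, and `scale` are assumptions made for illustration. Integer coefficients are enough for these particular checks.

```python
def multiply(u, v):
    """Multiply two numbers of the form a + b*sqrt2 + c*sqrt3 + e*sqrt6."""
    a1, b1, c1, e1 = u
    a2, b2, c2, e2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * e1 * e2,  # rational part
        a1 * b2 + b1 * a2 + 3 * (c1 * e2 + e1 * c2),        # sqrt3 * sqrt6 = 3*sqrt2
        a1 * c2 + c1 * a2 + 2 * (b1 * e2 + e1 * b2),        # sqrt2 * sqrt6 = 2*sqrt3
        a1 * e2 + e1 * a2 + b1 * c2 + c1 * b2,              # sqrt2 * sqrt3 = sqrt6
    )


def add(u, v):
    return tuple(x + y for x, y in zip(u, v))


def scale(k, u):
    return tuple(k * x for x in u)


ONE = (1, 0, 0, 0)
SQRT2 = (0, 1, 0, 0)
ZERO = (0, 0, 0, 0)

alpha = (0, 1, 1, 0)              # sqrt2 + sqrt3
alpha2 = multiply(alpha, alpha)   # 5 + 2*sqrt6
alpha4 = multiply(alpha2, alpha2)

# alpha^4 - 10*alpha^2 + 1, the polynomial over Q
quartic = add(add(alpha4, scale(-10, alpha2)), ONE)
# alpha^2 - 2*sqrt2*alpha - 1, the polynomial over Q(sqrt2)
quadratic = add(add(alpha2, scale(-2, multiply(SQRT2, alpha))), scale(-1, ONE))

print(alpha2, quartic == ZERO, quadratic == ZERO)
```

The check confirms only that $\sqrt{2} + \sqrt{3}$ is a root of each polynomial. Irreducibility, and hence minimality, still rests on Lemma 9.12 and on the automorphism argument above.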


If F is a field and $\alpha \in F$ such that $x - \alpha$ is a divisor of some $f(x) \in F[x]$, then certainly
$f(\alpha) = 0$. However, observe that $x - \alpha$ is the minimum polynomial for $\alpha$ over F. Therefore,
Corollary 12.7 asserts that $x - \alpha$ is a divisor of every polynomial in F[x] that has $\alpha$ as a root.
We have now proven a familiar fact that we record as


Corollary 12.10. Suppose F is a field, $\alpha \in F$, and $f(x) \in F[x]$. Then $\alpha$ is a root of f(x) if and
only if $x - \alpha$ is a divisor of f(x).
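
Corollary 12.10 is easy to see in action over a small finite field. The following sketch divides f(x) by $x - a$ in $\mathbb{Z}_p[x]$ using Horner's scheme (often called synthetic division). The function name `divide_by_linear`, the sample polynomial $x^2 + [1]_5$, and the highest-degree-first coefficient lists are illustrative choices, not taken from the text.

```python
def divide_by_linear(coefficients, a, p):
    """Divide f(x) by (x - a) in Z_p[x]; return (quotient coefficients, remainder).

    The remainder produced by Horner's scheme equals f(a) mod p.
    """
    accumulator = 0
    partial = []
    for c in coefficients:
        accumulator = (accumulator * a + c) % p
        partial.append(accumulator)
    return partial[:-1], partial[-1]


# f(x) = x^2 + [1] in Z_5[x]
f = [1, 0, 1]
for a in range(5):
    quotient, remainder = divide_by_linear(f, a, 5)
    print(a, quotient, remainder)
```

The remainder is 0 exactly for a = 2 and a = 3, which are the roots of $x^2 + [1]_5$ in $\mathbb{Z}_5$. For a = 2 the quotient is $x + [2]_5$, so $x^2 + [1]_5 = (x - [2]_5)(x + [2]_5)$.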


12.4 Euclidean Algorithm and Greatest Common Divisors 


In order to prove the uniqueness part of Theorem 12.4, we return to the path we followed in 
Chapter 3 and begin by examining greatest common divisors in F[x]. In Z, the greatest 
common divisor of nonzero integers a, b was the largest integer that was a divisor of both a 
and b. Therefore, one might suspect that, in F[x], we would define the greatest common 
divisor of nonzero polynomials a(x), b(x) to be the polynomial of largest degree that is a 
divisor of both a(x) and b(x). Let us now consider what this would mean if we looked at the 
polynomials $30x^4 - 30x^3$ and $70x^4 - 70x^2$ in Q[x]. There are many polynomials of degree 3 in
Q[x] that are divisors of both $30x^4 - 30x^3$ and $70x^4 - 70x^2$. For example, $x^3 - x^2$,
$5x^3 - 5x^2$, and $10x^3 - 10x^2$ are all common divisors of $30x^4 - 30x^3$ and $70x^4 - 70x^2$. In fact,
for every nonzero $\alpha \in \mathbb{Q}$, the polynomial $\alpha x^3 - \alpha x^2$ is a common divisor of $30x^4 - 30x^3$ and
$70x^4 - 70x^2$. Therefore, there is certainly no unique polynomial of degree 3 in Q[x] that is a
common divisor of $30x^4 - 30x^3$ and $70x^4 - 70x^2$. However, all of the degree 3 polynomials
that are common divisors of $30x^4 - 30x^3$ and $70x^4 - 70x^2$ are of the form $\alpha \cdot (x^3 - x^2)$, where
$\alpha$ is a nonzero element of Q. Thus, $x^3 - x^2$ is the only monic polynomial of degree 3 that is a
common divisor of $30x^4 - 30x^3$ and $70x^4 - 70x^2$ in Q[x]. As a result, it appears that we
should define the greatest common divisor of nonzero polynomials $a(x), b(x) \in F[x]$ to be the
unique monic polynomial of largest degree that is a common divisor of a(x) and b(x).
However, even now, we are not entirely finished with the problem of uniqueness. 


In Z, if $c \neq d$, then either c > d or c < d. This means that if c and d are two different common
divisors of a and b, then either c or d is a greater common divisor than the other. However,
the situation is quite different in F[x] as a polynomial can have many monic divisors of the
same degree. For example, if we go back to $30x^4 - 30x^3$ and $70x^4 - 70x^2$, both $x^2$ and $x^2 - x$
are common divisors that are monic and have the same degree. Therefore, at this point, given
$a(x), b(x) \in F[x]$, it is not obvious that there will be a unique monic polynomial of largest
degree that is a common divisor of a(x) and b(x). Fortunately, using our work in Z as a guide,
there is another way to view greatest common divisors in F[x]. In Z, it was very helpful to
consider gcd(a, b) as the smallest positive integer that could be written in the form $s \cdot a + t \cdot b$,
where $s, t \in \mathbb{Z}$. This motivates the following.


Lemma 12.11. If $a(x), b(x) \in F[x]$ then there is a unique monic polynomial c(x) of smallest
possible degree that can be written in the form $c(x) = s(x) \cdot a(x) + t(x) \cdot b(x)$, where
$s(x), t(x) \in F[x]$.

Proof. Let S be the set of nonnegative integers n with the property that there exists some
$d(x) \in F[x]$ of degree n such that $d(x) = s(x) \cdot a(x) + t(x) \cdot b(x)$, for some $s(x), t(x) \in F[x]$.
Since S is certainly nonempty, the Well Ordering Principle guarantees that S contains a
smallest element $m \geq 0$. Therefore, there exists some $d(x) \in F[x]$ of degree m such that
$d(x) = s(x) \cdot a(x) + t(x) \cdot b(x)$, where $s(x), t(x) \in F[x]$. If we let $\alpha$ be the leading coefficient of
d(x), then the previous equation implies that


$$\alpha^{-1}d(x) = (\alpha^{-1}s(x)) \cdot a(x) + (\alpha^{-1}t(x)) \cdot b(x).$$

Next, if we let $c(x) = \alpha^{-1}d(x)$, $s_1(x) = \alpha^{-1}s(x)$, and $t_1(x) = \alpha^{-1}t(x)$, then we now have

(5) $c(x) = s_1(x) \cdot a(x) + t_1(x) \cdot b(x)$.

To complete the proof of this lemma, we need to show that c(x) is the only monic polynomial
of degree m in F[x] that can be written as a multiple of a(x) plus a multiple of b(x). To this
end, suppose


(6) $e(x) = s_2(x) \cdot a(x) + t_2(x) \cdot b(x)$


is a monic polynomial of degree m, where $s_2(x), t_2(x) \in F[x]$. Subtracting equation (6) from
equation (5) results in


$$c(x) - e(x) = (s_1(x) - s_2(x)) \cdot a(x) + (t_1(x) - t_2(x)) \cdot b(x).$$

Since both c(x) and e(x) have leading term $x^m$, we see that either

$$\deg(c(x) - e(x)) < m \quad \text{or} \quad c(x) - e(x) = 0.$$


In the first case, c(x) - e(x) has degree less than m, yet it can be written as a multiple of a(x)
plus a multiple of b(x). This contradicts the fact that m is the smallest integer in S. Therefore,
the only remaining possibility is that c(x) - e(x) = 0. Thus, c(x) = e(x) and c(x) is indeed the
unique monic polynomial of smallest degree that can be written as a multiple of a(x) plus a
multiple of b(x).

Lemma 12.11 tells us of the existence of the polynomial c(x), but it doesn’t provide us with an 
algorithm for finding it. In order to show that c(x) has the additional property of being the 
unique monic polynomial of largest degree in F[x] that is a common divisor of a(x) and b(x), 
we will first need to develop an algorithm for finding c(x). This algorithm will enable us to 
better understand the properties possessed by c(x). It should come as no surprise that the 
algorithm will be very similar to the Euclidean Algorithm we saw in Chapter 3, but this 
version applies to F[x]. But first, we need the analog of Lemma 3.7 for F[x]. 


Lemma 12.12. Let a(x), b(x), s(x), t(x), and c(x) belong to F[x] such that c(x) is a divisor
of a(x) and b(x). Then c(x) is also a divisor of $s(x) \cdot a(x) + t(x) \cdot b(x)$.


Proof. Since c(x) is a divisor of both a(x) and b(x), there exist polynomials u(x) and v(x)
such that $a(x) = u(x) \cdot c(x)$ and $b(x) = v(x) \cdot c(x)$. We now have


$$s(x) \cdot a(x) + t(x) \cdot b(x) = s(x) \cdot (u(x) \cdot c(x)) + t(x) \cdot (v(x) \cdot c(x)) = (s(x) \cdot u(x) + t(x) \cdot v(x)) \cdot c(x).$$


As a result, c(x) is a divisor of $s(x) \cdot a(x) + t(x) \cdot b(x)$.


The next result develops the Euclidean Algorithm for F'[x] and proves that the polynomial 
c(x) from Lemma 12.11 is indeed the greatest common divisor of a(x) and b(x). The proof 
resembles the proof of Theorem 3.8. 


Theorem 12.13: The Euclidean Algorithm and Greatest Common Divisors in F[x]. If
$a(x), b(x) \in F[x]$ are nonzero, let c(x) be the monic polynomial of smallest possible degree in
F[x] that can be written as a multiple of a(x) plus a multiple of b(x). Then


(a) c(x) is a common divisor of a(x) and b(x),


(b) c(x) is a multiple of every other common divisor of a(x) and b(x) and therefore has the
largest degree of any common divisor of a(x) and b(x), and


(c) c(x) is the only monic polynomial of its degree that is a common divisor of a(x) and b(x).

We call c(x) the greatest common divisor of a(x) and b(x) and denote it as gcd(a(x), b(x)).
Proof. The proof consists of developing an algorithm to find the polynomial discussed in
Lemma 12.11. If at any point you have difficulty understanding this proof, you should go back
and review the more concrete proof of Theorem 3.8. We begin by applying the division
algorithm to a(x) and b(x) to obtain polynomials $q_1(x)$ and $r_1(x)$ such that

$$b(x) = q_1(x) \cdot a(x) + r_1(x) \quad \text{and} \quad \deg(r_1(x)) < \deg(a(x)) \text{ or } r_1(x) = 0.$$

If $r_1(x) \neq 0$, divide a(x) by $r_1(x)$ to obtain polynomials $q_2(x), r_2(x)$ such that

$$a(x) = q_2(x) \cdot r_1(x) + r_2(x) \quad \text{and} \quad \deg(r_2(x)) < \deg(r_1(x)) \text{ or } r_2(x) = 0.$$

Next, if $r_2(x) \neq 0$, divide $r_1(x)$ by $r_2(x)$ to obtain polynomials $q_3(x), r_3(x)$ such that

$$r_1(x) = q_3(x) \cdot r_2(x) + r_3(x) \quad \text{and} \quad \deg(r_3(x)) < \deg(r_2(x)) \text{ or } r_3(x) = 0.$$


Observe that every time we apply the division algorithm, we obtain a remainder that is either
equal to 0 or has smaller degree than the remainder in the previous step. In particular, if none of
our remainders are yet equal to 0, we have $\deg(a(x)) > \deg(r_1(x)) > \deg(r_2(x)) > \deg(r_3(x))$.
Therefore, if we continue this process of dividing remainder $r_i(x)$ by the next remainder
$r_{i+1}(x)$, we will eventually obtain a remainder of 0. Let us now suppose that n is the positive
integer such that $r_n(x)$ is the last remainder that is not 0. This says that if we continue to apply
this procedure, we will eventually obtain the equations


$$r_{n-2}(x) = q_n(x) \cdot r_{n-1}(x) + r_n(x) \quad \text{and} \quad r_{n-1}(x) = q_{n+1}(x) \cdot r_n(x) + 0,$$

where


$$\deg(a(x)) > \deg(r_1(x)) > \deg(r_2(x)) > \deg(r_3(x)) > \cdots > \deg(r_{n-2}(x)) > \deg(r_{n-1}(x)) > \deg(r_n(x)).$$


The last equation tells us that $r_n(x)$ is a divisor of $r_{n-1}(x)$. Applying Lemma 12.12 to the next to
last equation tells us that $r_n(x)$ is a divisor of $r_{n-2}(x)$. We can continue to move upward
through our list of equations, and if we apply Lemma 12.12 at every step, we see that


$$r_n(x) \mid r_{n-1}(x), \quad r_n(x) \mid r_{n-2}(x), \quad r_n(x) \mid r_{n-3}(x), \quad \ldots, \quad r_n(x) \mid r_2(x),$$


$$r_n(x) \mid r_1(x), \quad r_n(x) \mid a(x), \quad r_n(x) \mid b(x).$$


Therefore, $r_n(x)$ is a common divisor of a(x) and b(x). Observe that if $\alpha$ is the leading
coefficient of $r_n(x)$, then the polynomial $c(x) = \alpha^{-1}r_n(x)$ is monic and is also a common
divisor of a(x) and b(x). Therefore, c(x) satisfies property (a).


The next to last equation shows that $r_n(x)$ can be written as a multiple of $r_{n-2}(x)$ plus a
multiple of $r_{n-1}(x)$. Moving up to the next equation enables us to replace $r_{n-1}(x)$ by
a multiple of $r_{n-3}(x)$ plus a multiple of $r_{n-2}(x)$, which shows that $r_n(x)$ can be written as a
multiple of $r_{n-3}(x)$ plus a multiple of $r_{n-2}(x)$. Continuing in this way, we eventually see that
$r_n(x)$ can be written as a multiple of a(x) plus a multiple of $r_1(x)$, and, finally, $r_n(x)$ can be
written as a multiple of b(x) plus a multiple of a(x). Therefore, we can write


$$r_n(x) = s(x) \cdot a(x) + t(x) \cdot b(x),$$

where $s(x), t(x) \in F[x]$. But if we multiply this equation by $\alpha^{-1}$, we see that

$$c(x) = \alpha^{-1}r_n(x) = (\alpha^{-1}s(x)) \cdot a(x) + (\alpha^{-1}t(x)) \cdot b(x).$$


Thus, c(x) can also be written as a multiple of b(x) plus a multiple of a(x).


If d(x) is any common divisor of a(x) and b(x), then Lemma 12.12 asserts that d(x) is a 
divisor of any multiple of a(x) plus a multiple of b(x). Therefore, d(x) is also a divisor of c(x) 
and the degree of c(x) must be greater than or equal to the degree of d(x). In addition, if d(x) is 
monic and has the same degree as c(x), then c(x) = d(x). Thus, c(x) also satisfies properties 
(b) and (c). 


Having succeeded in showing that c(x) can be written as a multiple of a(x) plus a multiple of 
b(x), it now suffices to show that it has the smallest degree of any monic polynomial that can 
be written this way. This will show that c(x) is indeed the polynomial discussed in 

Lemma 12.11. To this end, suppose f(x) is a polynomial that can be also written as a multiple 
of a(x) plus a multiple of b(x). Since c(x) is a common divisor of a(x) and b(x), Lemma 12.12 
implies that c(x) is also a divisor of f(x). However, since c(x) is a divisor of f(x), we see 

that the degree of c(x) is less than or equal to the degree of f(x). Thus, c(x) is indeed the 
monic polynomial of smallest possible degree that can be written as a multiple of a(x) plus a 
multiple of b(x). As a result, c(x) does indeed have all the properties we would expect from a 
greatest common divisor. 
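
The procedure in this proof translates directly into a short program. The following Python sketch, written for Q[x] with exact fractions, runs the repeated divisions and simultaneously tracks the multipliers s(x) and t(x), returning the monic c(x) with $c(x) = s(x) \cdot a(x) + t(x) \cdot b(x)$. All function names and the highest-degree-first list representation are assumptions for illustration, and the code expects a(x) and b(x) to be nonzero, as in Theorem 12.13.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zero coefficients; the empty list is the zero polynomial."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]


def add(p, q):
    n = max(len(p), len(q))
    p = [Fraction(0)] * (n - len(p)) + p
    q = [Fraction(0)] * (n - len(q)) + q
    return trim([x + y for x, y in zip(p, q)])


def scale(p, c):
    return trim([c * x for x in p])


def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)


def divmod_poly(b, a):
    """Division algorithm: b = quotient * a + remainder, with a nonzero and trimmed."""
    quotient, remainder = [], trim(list(b))
    while len(remainder) >= len(a):
        coeff = remainder[0] / a[0]
        quotient.append(coeff)
        for i in range(len(a)):
            remainder[i] -= coeff * a[i]
        remainder.pop(0)
    return quotient, trim(remainder)


def extended_gcd(a, b):
    """Return (c, s, t) with c monic and c = s*a + t*b."""
    a = trim([Fraction(x) for x in a])
    b = trim([Fraction(x) for x in b])
    # Invariants: r0 = s0*a + t0*b and r1 = s1*a + t1*b.
    r0, s0, t0 = a, [Fraction(1)], []
    r1, s1, t1 = b, [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, scale(mul(q, s1), Fraction(-1)))
        t0, t1 = t1, add(t0, scale(mul(q, t1), Fraction(-1)))
    lead_inverse = 1 / r0[0]  # make the last nonzero remainder monic
    return scale(r0, lead_inverse), scale(s0, lead_inverse), scale(t0, lead_inverse)


# gcd of 70x^4 - 70x^2 and 30x^4 - 30x^3 in Q[x]
c, s, t = extended_gcd([70, 0, -70, 0, 0], [30, -30, 0, 0, 0])
print(c, s, t)
```

For this input the result is the monic polynomial $x^3 - x^2$ with $s = \frac{1}{70}$ and $t = -\frac{1}{30}$. Tracking s and t during the forward pass is a design choice that avoids the back-substitution described in the proof; both routes produce a valid pair of multipliers.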


Examples


1. Before looking at the Euclidean Algorithm in F[x], we looked briefly at the problem
of finding the greatest common divisor of $70x^4 - 70x^2$ and $30x^4 - 30x^3$ in Q[x]. We
now use the Euclidean Algorithm to complete the problem. First divide $70x^4 - 70x^2$
by $30x^4 - 30x^3$ to obtain


(7) $70x^4 - 70x^2 = \left(\frac{7}{3}\right) \cdot (30x^4 - 30x^3) + (70x^3 - 70x^2)$.


Next, we divide $30x^4 - 30x^3$ by $70x^3 - 70x^2$ to obtain


$$30x^4 - 30x^3 = \left(\frac{3}{7}x\right) \cdot (70x^3 - 70x^2) + 0.$$


Since $70x^3 - 70x^2$ is the last nonzero remainder, it would be the greatest common
divisor but for the fact that it is not monic. However, multiplying $70x^3 - 70x^2$ by $\frac{1}{70}$,
we see that $x^3 - x^2 = \gcd(70x^4 - 70x^2, 30x^4 - 30x^3)$.


In order to write $x^3 - x^2$ as a multiple of $70x^4 - 70x^2$ plus a multiple of $30x^4 - 30x^3$,
we will go back through our equations and first write $70x^3 - 70x^2$ in this form. Then
we will multiply our equation by $\frac{1}{70}$ to complete the problem. In particular, it follows
from (7) that


$$70x^3 - 70x^2 = 1 \cdot (70x^4 - 70x^2) + \left(-\frac{7}{3}\right) \cdot (30x^4 - 30x^3).$$

Multiplying this equation by $\frac{1}{70}$ yields

$$x^3 - x^2 = \frac{1}{70} \cdot (70x^4 - 70x^2) + \left(-\frac{1}{30}\right) \cdot (30x^4 - 30x^3),$$


as desired.


2. Let us now compute $\gcd(x^4 - x^2 - 2, x^4 + x^2 - 6)$ in $\mathbb{Q}[x]$. First, we divide $x^4 - x^2 - 2$
by $x^4 + x^2 - 6$ to obtain

$$x^4 - x^2 - 2 = 1 \cdot (x^4 + x^2 - 6) + (-2x^2 + 4). \tag{8}$$

Next, we divide $x^4 + x^2 - 6$ by $-2x^2 + 4$ to obtain

$$x^4 + x^2 - 6 = \left(-\tfrac{1}{2}x^2 - \tfrac{3}{2}\right) \cdot (-2x^2 + 4) + 0.$$

Therefore, all that remains to find the greatest common divisor is to multiply
$-2x^2 + 4$ by an appropriate number to obtain a monic polynomial. Thus,
$x^2 - 2 = \gcd(x^4 - x^2 - 2, x^4 + x^2 - 6)$.

To write $x^2 - 2$ as a multiple of $x^4 - x^2 - 2$ plus a multiple of $x^4 + x^2 - 6$, we can
rewrite (8) as

$$-2x^2 + 4 = 1 \cdot (x^4 - x^2 - 2) + (-1) \cdot (x^4 + x^2 - 6).$$

Multiplying this equation by $-\tfrac{1}{2}$ gives us

$$x^2 - 2 = \left(-\tfrac{1}{2}\right) \cdot (x^4 - x^2 - 2) + \tfrac{1}{2} \cdot (x^4 + x^2 - 6),$$

as desired.


3. For another example, let us find the greatest common divisor of $x^3 + x^2 + x$ and
$[2]_5x^3 + [2]_5x^2$ in $\mathbb{Z}_5[x]$. First, we divide $x^3 + x^2 + x$ by $[2]_5x^3 + [2]_5x^2$ to obtain

$$x^3 + x^2 + x = [3]_5 \cdot ([2]_5x^3 + [2]_5x^2) + x.$$

Dividing $[2]_5x^3 + [2]_5x^2$ by $x$ gives us

$$[2]_5x^3 + [2]_5x^2 = ([2]_5x^2 + [2]_5x) \cdot x + 0.$$

Since $x$ is already monic, $x = \gcd(x^3 + x^2 + x, [2]_5x^3 + [2]_5x^2)$. It is also easy to see
that we can write

$$x = [1]_5 \cdot (x^3 + x^2 + x) + [2]_5 \cdot ([2]_5x^3 + [2]_5x^2).$$
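
The back-substitution carried out by hand in these examples can also be mechanized. The following Python sketch is one possible way to do so for polynomials over Q only; it is not taken from the text. Its assumptions are ours: a polynomial is stored as a list of coefficients with the constant term first, both inputs are nonzero, and the helper names (trim, poly_divmod, add, mul, extended_gcd) are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients from the high-degree end (constant term is p[0])."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(a, b):
    """Return (q, r) with a = q*b + r and deg r < deg b, working over Q."""
    a, b = trim(a), trim(b)
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    r = [Fraction(c) for c in a]
    while len(r) >= len(b):
        shift = len(r) - len(b)
        coef = r[-1] / Fraction(b[-1])
        q[shift] = coef
        for i, c in enumerate(b):
            r[i + shift] -= coef * c
        r = trim(r)  # the leading term has been cancelled exactly
    return trim(q), r

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])

def mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def extended_gcd(a, b):
    """Return (c, s, t) with c monic, c = gcd(a, b) and c = s*a + t*b.
    Assumes a and b are both nonzero, as in the text."""
    r0, r1 = trim([Fraction(x) for x in a]), trim([Fraction(x) for x in b])
    s0, s1 = [Fraction(1)], []
    t0, t1 = [], [Fraction(1)]
    while r1:
        q, r = poly_divmod(r0, r1)
        neg_q = [-x for x in q]
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, mul(neg_q, s1))
        t0, t1 = t1, add(t0, mul(neg_q, t1))
    lead = r0[-1]  # divide by the leading coefficient to make the gcd monic
    return ([x / lead for x in r0], [x / lead for x in s0], [x / lead for x in t0])

# Example 1: a(x) = 70x^4 - 70x^2 and b(x) = 30x^4 - 30x^3
a = [0, 0, -70, 0, 70]
b = [0, 0, 0, -30, 30]
c, s, t = extended_gcd(a, b)
```

With these inputs, c holds the coefficients of $x^3 - x^2$ and s, t are the constants $\tfrac{1}{70}$ and $-\tfrac{1}{30}$, in agreement with the first example. Exact rational arithmetic is used so that no rounding error can hide a remainder that should be zero.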


As you probably expected, if a(x), b(x) € F[x] with gcd(a(x), b(x)) = 1, then we say that a(x) 
and b(x) are relatively prime. You should convince yourself that a(x) and b(x) are relatively 
prime if and only if there is no irreducible polynomial f(x) that is a divisor of both a(x) and 
b(x). It now follows that if $p(x), q(x) \in F[x]$ are irreducible, then either they are relatively
prime or there is some nonzero $\alpha \in F$ such that $q(x) = \alpha p(x)$. Thus, if $p(x), q(x)$ are monic
and irreducible, then either they are relatively prime or are equal. We can now return to the
path that will lead to a proof of the second part of Theorem 12.4. 


Lemma 12.14. If $a(x), b(x), f(x)$ are nonzero polynomials in $F[x]$ such that
$f(x) \mid (a(x) \cdot b(x))$ and $\gcd(a(x), f(x)) = 1$, then $f(x) \mid b(x)$.


Proof. Since $a(x)$ and $f(x)$ are relatively prime, we can write 1 as a multiple of $a(x)$ plus a
multiple of $f(x)$. Therefore, there exist polynomials $r(x)$ and $s(x)$ such that

$$1 = r(x) \cdot a(x) + s(x) \cdot f(x).$$

Multiplying this equation by $b(x)$ results in

$$b(x) = b(x) \cdot (r(x) \cdot a(x)) + b(x) \cdot (s(x) \cdot f(x)) = r(x) \cdot (a(x) \cdot b(x)) + (b(x) \cdot s(x)) \cdot f(x).$$

Having written $b(x)$ as a multiple of $a(x) \cdot b(x)$ plus a multiple of $f(x)$, Lemma 12.12 implies
that $f(x) \mid b(x)$.


The next lemma is essentially Corollary 3.10 modified for $F[x]$. Note that it is now the monic,
irreducible polynomials in $F[x]$ that assume the role played by prime numbers in
Corollary 3.10.


Lemma 12.15. Let $p(x), q_1(x), q_2(x), \ldots, q_n(x)$ be monic, irreducible polynomials in $F[x]$
(which are not necessarily distinct). If

$$p(x) \mid (q_1(x) \cdot q_2(x) \cdots q_n(x)),$$

then $p(x)$ is equal to one of the $q_i(x)$'s.



Proof. We let $T$ be the set of positive integers $n$ such that whenever a monic, irreducible
polynomial $p(x)$ divides the product $q_1(x) \cdot q_2(x) \cdots q_n(x)$ of $n$ monic, irreducible
polynomials, then $p(x)$ is equal to one of the $q_i(x)$'s. We need to show that $T = \mathbb{N}$, and we will
proceed by using Mathematical Induction. First, we need to show that $T$ contains 1. So let us
consider the case where $p(x)$ and $q_1(x)$ are monic, irreducible polynomials such that
$p(x) \mid q_1(x)$. This immediately implies that $p(x) = q_1(x)$, so $1 \in T$.


Next, we consider the case where $T$ contains some positive integer $k$. We need to show that $T$
also contains $k + 1$. Therefore, suppose that we are now in the situation where
$p(x), q_1(x), q_2(x), \ldots, q_k(x), q_{k+1}(x)$ are monic, irreducible polynomials such that

$$p(x) \mid (q_1(x) \cdot q_2(x) \cdots q_k(x) \cdot q_{k+1}(x)). \tag{9}$$

We need to show that $p(x)$ is equal to one of the $q_i(x)$'s.


There are two possibilities: either $p(x) = q_{k+1}(x)$ or $p(x) \neq q_{k+1}(x)$. In the first case, we are
done. In the second case, let

$$b(x) = q_1(x) \cdot q_2(x) \cdots q_k(x),$$

then (9) becomes $p(x) \mid (b(x) \cdot q_{k+1}(x))$. However, in this case, $p(x)$ and $q_{k+1}(x)$ are relatively
prime. Therefore, we can apply Lemma 12.14 to assert that $p(x) \mid b(x)$. But $b(x)$ is a product
of $k$ monic, irreducible polynomials and $T$ contains $k$. Therefore, $p(x)$ is indeed equal to one
of the $q_i(x)$'s that appear in $b(x)$, and we are also done in this case.


We can now complete the proof of the main result of this chapter. 


Proof of the final part of Theorem 12.4 (uniqueness of the factorization into irreducible
polynomials). We will proceed using the Second Version of Mathematical Induction. Let $T$
denote those natural numbers $n$ such that every $f(x) \in F[x]$ of degree $n$ can be written
uniquely, up to order, as an element of $F$ times monic, irreducible polynomials in $F[x]$. Our
goal is to show that $T = \mathbb{N}$. Mathematical Induction asserts that it will be enough for us to
show that $T$ contains 1 and that whenever $T$ contains the set of numbers $\{1, 2, \ldots, k\}$, then it
also contains the number $k + 1$.


If $f(x) = \alpha x + \beta \in F[x]$, then certainly $f(x) = \alpha\left(x + \tfrac{\beta}{\alpha}\right)$ is one way to express $f(x)$ as an
element of $F$ times monic, irreducible polynomials in $F[x]$. If $f(x) = \gamma p_1(x) \cdots p_n(x)$ is
another factorization of $f(x)$, then since the leading coefficient of $f(x)$ is $\alpha$ and each $p_i(x)$ is
monic, we can see that $\alpha = \gamma$. Since each $p_i(x)$ has degree at least 1 and $f(x)$ has degree 1, it
is clear that $n = 1$. As a result,

$$\alpha\left(x + \tfrac{\beta}{\alpha}\right) = \alpha p_1(x),$$

which immediately tells us that $p_1(x) = x + \tfrac{\beta}{\alpha}$. Thus, our factorization of $f(x)$ as
$f(x) = \alpha\left(x + \tfrac{\beta}{\alpha}\right)$ is unique and $T$ does contain 1.


Now suppose that $T$ contains the set of numbers $\{1, 2, \ldots, k\}$; we need to show that $T$ contains
$k + 1$. Therefore, we may assume that every polynomial whose degree is at least 1 and is less
than $k + 1$ can be written uniquely, up to order, as an element of $F$ times monic, irreducible
polynomials in $F[x]$. Our job is to show that the same is true for every polynomial of degree
$k + 1$. Therefore, let us suppose that $f(x) \in F[x]$ has degree $k + 1$ and that

$$f(x) = \alpha\, p_1(x) \cdot p_2(x) \cdots p_n(x) \quad \text{and} \quad f(x) = \beta\, q_1(x) \cdot q_2(x) \cdots q_m(x)$$

are two ways of writing $f(x)$ as an element of $F$ times monic, irreducible polynomials in $F[x]$.
To show that these two factorizations of $f(x)$ are identical, up to order, we need to show that
there is a reordering of the $q_i(x)$'s such that $\alpha = \beta$, $n = m$, and $p_i(x) = q_i(x)$, for all $i \le n$.


Since $p_1(x)$ is a divisor of $f(x)$, $p_1(x)$ must also be a divisor of $q_1(x) \cdot q_2(x) \cdots q_m(x)$. By
Lemma 12.15, $p_1(x)$ must be equal to $q_j(x)$, for some $j \le m$. We can now reorder the $q_j(x)$'s
such that $p_1(x) = q_1(x)$. If we let $g(x) = \frac{f(x)}{p_1(x)}$, we now have

$$g(x) = \alpha\, p_2(x) \cdot p_3(x) \cdots p_n(x) \quad \text{and} \quad g(x) = \beta\, q_2(x) \cdot q_3(x) \cdots q_m(x). \tag{10}$$

One possibility is that $g(x)$ has degree 0. In this case, the previous equations involving $g(x)$
reduce to $g(x) = \alpha$ and $g(x) = \beta$. Thus, $\alpha = \beta$ and it follows that $f(x) = \alpha p_1(x)$ is the only
way to write $f(x)$ as an element of $F$ times monic, irreducible polynomials in $F[x]$.


The only other possibility is that $g(x)$ has degree at least 1. In this case, $\deg(g(x)) \in T$, so $g(x)$
can be written uniquely, up to order, as an element of $F$ times monic, irreducible polynomials
in $F[x]$. In light of this, it follows that we can reorder the $q_j(x)$'s in (10) so that $\alpha = \beta$,
$n - 1 = m - 1$, and $p_i(x) = q_i(x)$, whenever $2 \le i \le n$. However, having already shown that
$p_1(x) = q_1(x)$, it now follows immediately that $\alpha = \beta$, $n = m$, and $p_i(x) = q_i(x)$, for all $i \le n$.
Thus, we have succeeded in showing that the factorization of $f(x)$ is unique, up to order.
Hence, $k + 1 \in T$, as desired.


As we will see in the next example, the unique factorization of polynomials can be used to 
help us find minimum polynomials. 
Example 


Let us find the minimum polynomial for $\sqrt{2} + \sqrt{7}$ over $\mathbb{Q}$. If we let $\alpha = \sqrt{2} + \sqrt{7}$, then
squaring both sides yields

$$\alpha^2 = 9 + 2\sqrt{14}.$$

This implies that

$$\alpha^2 - 9 = 2\sqrt{14}$$

and squaring again results in

$$\alpha^4 - 18\alpha^2 + 81 = 56.$$

Thus, we can see that $\alpha^4 - 18\alpha^2 + 25 = 0$, and it follows that $\alpha$ is a root of
$x^4 - 18x^2 + 25 \in \mathbb{Q}[x]$.


Whereas finding $x^4 - 18x^2 + 25$ was fairly easy, we still need to show that it is irreducible,
and that is more difficult. One's first impulse is probably to try to use Eisenstein's
Criterion, but clearly it does not apply in this case. However, the Rational Root Test does
give us some information. Since none of $\pm 1, \pm 5, \pm 25$ are roots of $x^4 - 18x^2 + 25$, we
know that our polynomial has no linear factors in $\mathbb{Q}[x]$. Thus, if $x^4 - 18x^2 + 25$ is not
irreducible in $\mathbb{Q}[x]$, then it must be the product of two irreducible quadratics in $\mathbb{Q}[x]$. If
we were still in Chapter 9, we might try to finish this problem using Gauss' Lemma and
some brute-force computations. But we are now in a position to use Theorem 12.4.


In light of our work in Chapter 5 on the relationship between automorphisms and roots
of polynomials, your intuition probably tells you that $\pm\sqrt{2} \pm \sqrt{7}$ are the four roots in $\mathbb{C}$
of $x^4 - 18x^2 + 25$. To see this, let $\alpha = \sqrt{2} - \sqrt{7}$ and then perform the identical operations
we had performed when we had let $\alpha = \sqrt{2} + \sqrt{7}$. In this case, we now obtain
$\alpha^2 = 9 - 2\sqrt{14}$, $\alpha^2 - 9 = -2\sqrt{14}$, $\alpha^4 - 18\alpha^2 + 81 = 56$, and finally $\alpha^4 - 18\alpha^2 + 25 = 0$.
Thus $\sqrt{2} - \sqrt{7}$ is also a root of $x^4 - 18x^2 + 25$. But since all the exponents of $x$ in
$x^4 - 18x^2 + 25$ are even, $\gamma$ is a root of $x^4 - 18x^2 + 25$ if and only if $-\gamma$ is a root. Thus, all
four of $\pm\sqrt{2} \pm \sqrt{7}$ are roots of $x^4 - 18x^2 + 25$.


As a result, up to order, the only way to factor $x^4 - 18x^2 + 25$ in $\mathbb{C}[x]$ into monic,
irreducible polynomials is as

$$x^4 - 18x^2 + 25 = (x - (\sqrt{2} + \sqrt{7})) \cdot (x - (\sqrt{2} - \sqrt{7})) \cdot (x - (-\sqrt{2} + \sqrt{7})) \cdot (x - (-\sqrt{2} - \sqrt{7})). \tag{11}$$

Now, suppose $x^4 - 18x^2 + 25$ can indeed be factored into two monic, irreducible
quadratics in $\mathbb{Q}[x]$. Then the uniqueness portion of Theorem 12.4 asserts that when we
factor these two quadratics in $\mathbb{C}[x]$, we must obtain the same four linear factors as in
(11). Thus, one of these irreducible quadratics must be a multiple of $x - (\sqrt{2} + \sqrt{7})$ in
$\mathbb{C}[x]$. Hence, the product of $x - (\sqrt{2} + \sqrt{7})$ and one of the other three linear
factors of $x^4 - 18x^2 + 25$ in $\mathbb{C}[x]$ must belong to $\mathbb{Q}[x]$. However, we can easily check that

$$(x - (\sqrt{2} + \sqrt{7})) \cdot (x - (\sqrt{2} - \sqrt{7})) = x^2 - 2\sqrt{2}\,x - 5 \notin \mathbb{Q}[x],$$
$$(x - (\sqrt{2} + \sqrt{7})) \cdot (x - (-\sqrt{2} + \sqrt{7})) = x^2 - 2\sqrt{7}\,x + 5 \notin \mathbb{Q}[x],$$
$$(x - (\sqrt{2} + \sqrt{7})) \cdot (x - (-\sqrt{2} - \sqrt{7})) = x^2 - (9 + 2\sqrt{14}) \notin \mathbb{Q}[x].$$

Thus, $x^4 - 18x^2 + 25$ has no quadratic factors in $\mathbb{Q}[x]$, so it is irreducible in $\mathbb{Q}[x]$.
Therefore, Lemma 12.9 tells us that $x^4 - 18x^2 + 25$ is indeed the minimum polynomial for
$\sqrt{2} + \sqrt{7}$ over $\mathbb{Q}$.
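
As a quick numerical sanity check on the four roots used in this example, one might evaluate the polynomial at each of $\pm\sqrt{2} \pm \sqrt{7}$ in floating point. The short Python sketch below does this; the names p, roots, and residuals are our own, and a small residual is only evidence, not a proof, since floating-point arithmetic is inexact.

```python
import math

def p(x):
    """The candidate minimum polynomial x^4 - 18x^2 + 25."""
    return x**4 - 18 * x**2 + 25

# All four sign choices in (+/-)sqrt(2) (+/-)sqrt(7)
roots = [s2 * math.sqrt(2) + s7 * math.sqrt(7)
         for s2 in (1, -1) for s7 in (1, -1)]

# Each residual should be zero up to rounding error
residuals = [abs(p(r)) for r in roots]
```

The irreducibility argument itself still has to be carried out exactly, as in the example, because no finite-precision computation can show that a coefficient such as $2\sqrt{2}$ is irrational.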


Exercises for Sections 12.3 and 12.4 
For exercises 1—9, in Q[x], let 


fl) = 10x? (x2 +1)” (x? +44 10)° (x3 -7)° (x4 +5)” (x3 + 3x6), 
g(x) = —4(x-+3)? (x2 +1)? (x?-+4x+10)* (x3 -7)° (x3 +3x—6)" (x°+ 10x — 15)’, 
h(x) = 8x3 (x +3) (x2 +4x+ 10)” (x3 —7)° (x4 +5)” (x2 —8)* (x6 + 10x5 — 15)”. 
In exercises 1-4, you can use unique factorization instead of the Euclidean Algorithm to 
compute greatest common divisors. 

1. Find gcd( f(x), g(x)). 

2. Find gcd(g(x), h(x)). 

3. Find gcd( f(x), h(x)). 

4. Having defined the greatest common divisor of two polynomials, we can easily extend 
this definition to three polynomials as gcd(a(x), b(x), c(x)) = gcd(gcd(a(x), b(x)), c(x)). 
Using this definition, find gcd( f(x), g(x), h(x)). 

For exercises 5—9, please also read the following: 
In order to add rational functions, we usually begin by
computing the least common multiple of the denominators. More formally, given nonzero
polynomials $a_1(x), a_2(x), \ldots, a_n(x)$, we can let the least common multiple, denoted as
$\operatorname{lcm}(a_1(x), a_2(x), \ldots, a_n(x))$, be the monic polynomial of smallest degree that is a multiple of
every $a_i(x)$.


5. If $\alpha$ is the product of the leading coefficients of nonzero polynomials $a(x)$ and $b(x)$, show
that $\operatorname{lcm}(a(x), b(x)) = \dfrac{a(x) \cdot b(x)}{\alpha \cdot \gcd(a(x), b(x))}$.
6. Find lcm(f(x), g(x)).
7. Find lcm(g(x), h(x)).
8. Find lcm(f(x), h(x)).
9. Find lcm(f(x), g(x), h(x)).


For exercises 10-17, you may first need to read the instructions for exercises 1—9. In Zs[x], let 


J) = [Bls(x + [21s)4 (x + (41s)? (x? + 21s) (x? + 131s)” (x2 +x+ (2s), 
k(x) = (x + (115) x + (215)? (x2 + Bs)” (x? +x+ 1s) (x? +24 [21s)°, 


I(x) = [4]5x3 (x + 4s) (x2 + Bs)” (x? ++ (21s). 


10. Find gcd(j(x), k(x)).
11. Find gcd(k(x), l(x)).
12. Find gcd(j(x), l(x)).
13. Find gcd(j(x), k(x), l(x)).
14. Find lcm(j(x), k(x)).
15. Find lcm(k(x), l(x)).
16. Find lcm(j(x), l(x)).
17. Find lcm(j(x), k(x), l(x)).

18. If $a(x), b(x)$ are nonzero polynomials and $c(x) = \gcd(a(x), b(x))$, show that
$\gcd\left(\dfrac{a(x)}{c(x)}, \dfrac{b(x)}{c(x)}\right) = 1$.
19. Suppose $a(x), b(x)$ are nonzero polynomials such that $a(x) \pm b(x)$ are also nonzero. If
$c(x) = \gcd(a(x), b(x))$, find $\gcd(a(x) + b(x), a(x) - b(x))$.
20. If $\gcd(f(x), g(x)) = 1$ and $m, n \in \mathbb{N}$, show that $\gcd(f(x)^m, g(x)^n) = 1$.
21. Prove or provide a counterexample to the following statement: if c(x) = gcd( f(x), g(x)), 


then c(x)* = gcd( f(x)’, g(x)’). 


22. Prove or provide a counterexample to the following statement: if $c(x) = \gcd(f(x), g(x))$
and $n \in \mathbb{N}$, then $c(x)^n = \gcd(f(x)^n, g(x)^n)$.



23. Prove or provide a counterexample to the following statement: if c(x) = gcd( f(x), g(x)), 
then c(x)* = gcd (Fs g(x)°). 

24. Prove or provide a counterexample to the following statement: if c(x) = gcd( f(x), g(x)), 
then c(x)* = gcd (Fa), g(x)°). 


In exercises 25-32, find the minimum polynomial over Q of the given number. Recall, it is 
necessary to show that the polynomial you find is irreducible in Q[x]. 


De a/T 

26. 3(57) 

27. 5—6i 

28. —114+4/13 
29. /34+/5 
30. 5/6+4+2i 
31. 3/17 - 2/37 
32. i(195) 


In exercises 33-38, find the minimum polynomial over $\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$ of the
given number. Recall that it is necessary to show that the polynomial you find is irreducible in
$\mathbb{Q}(\sqrt{2})[x]$.


33. 19 
34. —5+2./2 
35. J24+V73 

36. 7—4i 

37. J3+/5 

38. 5/2 -TiV3 


In exercises 39-44, find the minimum polynomial over $\mathbb{Q}(i) = \{a + bi \mid a, b \in \mathbb{Q}\}$ of the given
number. Recall that it is necessary to show that the polynomial you find is irreducible in
$\mathbb{Q}(i)[x]$.


39. /3 
40. 5—4i 
AL CALS 


42. iv6 


43. $\sqrt{3} + \sqrt{5}$
44. $i\sqrt{2}$
45. (a) Find $a(x), b(x) \in \mathbb{Q}[x]$ such that
$a(x) \cdot (2x^5 + 6x^4 + 2x^3 + 5x^2 + 15x + 5) + b(x) \cdot (3x^3 + 2x^2 - 18x - 7) = \gcd(2x^5 + 6x^4 + 2x^3 + 5x^2 + 15x + 5,\ 3x^3 + 2x^2 - 18x - 7)$.
(b) Find $c(x), d(x) \in \mathbb{Q}[x]$ such that
$c(x) \cdot (2x^5 + 6x^4 + 2x^3 + 5x^2 + 15x + 5) + d(x) \cdot (3x^3 + 2x^2 - 18x - 7) = 5x^2 + 15x + 5$.
(c) Do there exist $e(x), f(x) \in \mathbb{Q}[x]$ such that
$e(x) \cdot (2x^5 + 6x^4 + 2x^3 + 5x^2 + 15x + 5) + f(x) \cdot (3x^3 + 2x^2 - 18x - 7) = 7x^2 - 11$?
(d) Describe all $m(x) \in \mathbb{Q}[x]$ such that there exist $g(x), h(x) \in \mathbb{Q}[x]$ such that
$g(x) \cdot (2x^5 + 6x^4 + 2x^3 + 5x^2 + 15x + 5) + h(x) \cdot (3x^3 + 2x^2 - 18x - 7) = m(x)$.

46. (a) Find $a(x), b(x) \in \mathbb{Q}[x]$ such that
$a(x) \cdot (2x^3 - 6x + 1) + b(x) \cdot (x^2 + 5x - 7) = \gcd(2x^3 - 6x + 1,\ x^2 + 5x - 7)$.
(b) Find $c(x), d(x) \in \mathbb{Q}[x]$ such that
$c(x) \cdot (2x^3 - 6x + 1) + d(x) \cdot (x^2 + 5x - 7) = 3x - 15$.
(c) Find $s(x), t(x) \in \mathbb{R}[x]$ such that
$s(x) \cdot (2x^3 - 6x + 1) + t(x) \cdot (x^2 + 5x - 7) = \pi x + e$.
(d) Find $v(x), w(x) \in \mathbb{C}[x]$ such that
$v(x) \cdot (2x^3 - 6x + 1) + w(x) \cdot (x^2 + 5x - 7) = (10 + 21i)x + (10 + 25i)$.

47. (a) Find $a(x), b(x) \in \mathbb{Z}_5[x]$ such that
$a(x) \cdot (x^3 + x^2 + x + [1]_5) + b(x) \cdot (x^2 + x + [3]_5) = \gcd(x^3 + x^2 + x + [1]_5,\ x^2 + x + [3]_5)$.
(b) Find $c(x), d(x) \in \mathbb{Z}_5[x]$ such that
$c(x) \cdot (x^3 + x^2 + x + [1]_5) + d(x) \cdot (x^2 + x + [3]_5) = [4]_5x + [3]_5$.
(c) Do there exist $e(x), f(x) \in \mathbb{Z}_5[x]$ such that
$e(x) \cdot (x^3 + x^2 + x + [1]_5) + f(x) \cdot (x^2 + x + [3]_5) = x + [1]_5$?
(d) Describe all $m(x) \in \mathbb{Z}_5[x]$ such that there exist $g(x), h(x) \in \mathbb{Z}_5[x]$ such that
$g(x) \cdot (x^3 + x^2 + x + [1]_5) + h(x) \cdot (x^2 + x + [3]_5) = m(x)$.

48. (a) Find $a(x), b(x) \in \mathbb{Z}_3[x]$ such that
$a(x) \cdot (x^3 + [2]_3x) + b(x) \cdot (x^3 + [2]_3) = \gcd(x^3 + [2]_3x,\ x^3 + [2]_3)$.
(b) Find $c(x), d(x) \in \mathbb{Z}_3[x]$ such that
$c(x) \cdot (x^3 + [2]_3x) + d(x) \cdot (x^3 + [2]_3) = [2]_3x^2 + x$.
(c) Do there exist $e(x), f(x) \in \mathbb{Z}_3[x]$ such that
$e(x) \cdot (x^3 + [2]_3x) + f(x) \cdot (x^3 + [2]_3) = x^2 + x$?
(d) Describe all $m(x) \in \mathbb{Z}_3[x]$ such that there exist $g(x), h(x) \in \mathbb{Z}_3[x]$ such that
$g(x) \cdot (x^3 + [2]_3x) + h(x) \cdot (x^3 + [2]_3) = m(x)$.


49. (a) Find $a(x), b(x) \in \mathbb{R}[x]$ such that $a(x) \cdot (x^2 + 1) + b(x) \cdot (x + 1) = 1$.
(b) Find $c(x), d(x) \in \mathbb{R}[x]$ such that $c(x) \cdot (x^2 + 1) + d(x) \cdot (x + 1) = 5x^2 - 3x - 2$.
(c) Adapt your answer from part (b) to find a real number $\alpha$ and some $g(x) \in \mathbb{R}[x]$ of
degree at most 1 such that $\alpha \cdot (x^2 + 1) + g(x) \cdot (x + 1) = 5x^2 - 3x - 2$.
(d) Show that $\alpha$ and $g(x)$ satisfy the equation
$$\frac{\alpha}{x + 1} + \frac{g(x)}{x^2 + 1} = \frac{5x^2 - 3x - 2}{(x + 1)(x^2 + 1)}.$$
(e) In calculus, a typical integration problem you might come across is
$$\int \frac{5x^2 - 3x - 2}{(x + 1)(x^2 + 1)}\, dx.$$
Explain why your answer from part (d) would be useful in solving this problem. In
light of this, hopefully you can begin to see that the Euclidean Algorithm and
Greatest Common Divisors in $\mathbb{R}[x]$ play a fundamental role in justifying the use of
partial fraction decompositions in calculus.


12.5 Formal Derivatives and Multiple Roots 


In light of Theorem 12.4, if $f(x) \in F[x]$, then we can always write $f(x)$ uniquely, up to
order, as

$$f(x) = \alpha (x - \alpha_1)^{n_1} (x - \alpha_2)^{n_2} \cdots (x - \alpha_t)^{n_t}\, g_1(x)^{m_1} g_2(x)^{m_2} \cdots g_s(x)^{m_s}, \tag{12}$$

where $\alpha \in F$, $\alpha_1, \ldots, \alpha_t$ are distinct elements of $F$, and the $g_i(x)$'s are distinct monic,
irreducible polynomials in $F[x]$ of degree at least 2. Observe that $\alpha_1, \alpha_2, \ldots, \alpha_t$ are the only
roots of $f(x)$ in $F$. In this situation, we say that $\alpha_i$ is a root of $f(x)$ of multiplicity $n_i$.


Examples
1. Let

$$f(x) = x^3 (x - 5)^2 (x + 3)^4 (x + 10)(x^2 + 2x + 2)(x^2 - 2)^3 \in \mathbb{Q}[x].$$

Since our polynomial is completely factored in $\mathbb{Q}[x]$, we can say that 0 is a root of
multiplicity 3, 5 is a root of multiplicity 2, $-3$ is a root of multiplicity 4, and $-10$ is a
root of multiplicity 1.


If we look at this same polynomial in $\mathbb{R}[x]$, it is not as yet completely factored since
$x^2 - 2$ is reducible over $\mathbb{R}$. The complete factorization of $f(x)$ in $\mathbb{R}[x]$ is

$$f(x) = x^3 (x - 5)^2 (x + 3)^4 (x + 10)(x - \sqrt{2})^3 (x + \sqrt{2})^3 (x^2 + 2x + 2).$$

In $\mathbb{R}$, we now say that 0 is a root of multiplicity 3, 5 is a root of multiplicity 2, $-3$ is a
root of multiplicity 4, $-10$ is a root of multiplicity 1, $\sqrt{2}$ is a root of multiplicity 3,
and $-\sqrt{2}$ is a root of multiplicity 3.


If we move to $\mathbb{C}[x]$, then $f(x)$ is still not completely factored as $x^2 + 2x + 2$ is
reducible over $\mathbb{C}$. In fact, the complete factorization of $f(x)$ in $\mathbb{C}[x]$ is

$$f(x) = x^3 (x - 5)^2 (x + 3)^4 (x + 10)(x - \sqrt{2})^3 (x + \sqrt{2})^3 (x - (-1 + i))(x - (-1 - i)).$$

Thus, in $\mathbb{C}$, we say that 0 is a root of multiplicity 3, 5 is a root of multiplicity 2, $-3$ is
a root of multiplicity 4, $-10$ is a root of multiplicity 1, $\sqrt{2}$ is a root of multiplicity 3,
$-\sqrt{2}$ is a root of multiplicity 3, $-1 + i$ is a root of multiplicity 1, and $-1 - i$ is a root
of multiplicity 1.


2. Consider $g(x) = x^3 + [1]_3$ in $\mathbb{Z}_3[x]$. Since $(x + [1]_3)^3 = x^3 + [1]_3$ in $\mathbb{Z}_3[x]$, the
complete factorization of $g(x)$ is

$$g(x) = (x + [1]_3)^3 = (x - [2]_3)^3.$$

Thus, $[2]_3$ is a root of multiplicity 3.
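
For polynomials over $\mathbb{Z}_p$, the multiplicity of a root can be found by dividing by $x - a$ repeatedly until a nonzero remainder appears. The Python sketch below is one way to do this using synthetic division; the function name multiplicity and the convention of listing coefficients from the highest degree down are our own assumptions, not notation from the text.

```python
def multiplicity(coeffs, a, p):
    """Multiplicity of a as a root of a nonzero polynomial over Z_p.

    coeffs lists the coefficients from the highest degree down to the
    constant term; p is assumed to be prime.
    """
    coeffs = [c % p for c in coeffs]
    if not any(coeffs):
        raise ValueError("the zero polynomial has no well-defined multiplicity")
    count = 0
    while len(coeffs) > 1:
        # Synthetic division by (x - a): the running values are the
        # quotient coefficients, and the last one is the remainder.
        running = []
        acc = 0
        for c in coeffs:
            acc = (acc * a + c) % p
            running.append(acc)
        remainder = running.pop()
        if remainder != 0:
            break
        coeffs = running
        count += 1
    return count

# g(x) = x^3 + [1]_3 in Z_3[x]
mult_of_2 = multiplicity([1, 0, 0, 1], 2, 3)
mult_of_1 = multiplicity([1, 0, 0, 1], 1, 3)
```

Here mult_of_2 is 3, matching the factorization $g(x) = (x - [2]_3)^3$, while mult_of_1 is 0 because $[1]_3$ is not a root.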


Depending upon the situation, if $f(x) \in F[x]$ is factored as in (12), then there are two ways to
count the number of roots of $f(x)$ in $F$. The simplest way is merely to say that $\alpha_1, \alpha_2, \ldots, \alpha_t$
are the roots of $f(x)$, so $f(x)$ has $t$ roots. However, in some other cases, we count a root as
many times as its multiplicity. For example, given $g(x) = (x - 1)^5 (x + 3)^8 \in \mathbb{Q}[x]$, we
consider 1 as counting as five roots of $g(x)$ and $-3$ as counting as eight roots of $g(x)$. If we
count roots in this fashion and if $f(x)$ is factored as in (12), then the number of roots of $f(x)$ in
$F$ is $n_1 + n_2 + \cdots + n_t$. When counting roots in this manner, we say that we are counting roots
including multiplicities. Observe that when $f(x)$ is factored as in (12), it is certainly the case
that $\deg(f(x)) \ge n_1 + n_2 + \cdots + n_t$. As a result, we have actually already proven


Corollary 12.16. If $f(x) \in F[x]$ has degree $n \ge 1$, then the number of roots in $F$, including
multiplicities, is at most $n$.


There are many ways to state the Fundamental Theorem of Algebra. Our Theorem 6.12 is one 
version but, in other books, you may come across different versions of this wonderful result. 
All of these versions follow easily from Theorem 6.12 and the work in the chapter. For 
convenience, we list them all in one place as 



Theorem 12.17—Versions of the Fundamental Theorem of Algebra 
1. Every $f(x) \in \mathbb{C}[x]$ of degree $n \ge 1$ has a root in $\mathbb{C}$.
2. Every $f(x) \in \mathbb{R}[x]$ of degree $n \ge 1$ has a root in $\mathbb{C}$.
3. Every $f(x) \in \mathbb{C}[x]$ of degree $n \ge 1$ can be written in the form
$f(x) = \alpha (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$,
where $\alpha, \alpha_1, \ldots, \alpha_n \in \mathbb{C}$.
4. Every $f(x) \in \mathbb{R}[x]$ of degree $n \ge 1$ can be written as a product of linear polynomials and
irreducible quadratic polynomials in $\mathbb{R}[x]$.
5. Every $f(x) \in \mathbb{R}[x]$ of degree $n \ge 3$ is reducible in $\mathbb{R}[x]$.


Proof. Part (1) is Theorem 6.12 and, since $\mathbb{R}[x] \subseteq \mathbb{C}[x]$, part (2) is merely a special case of
Theorem 6.12. Recall that Theorem 6.12 implies that the only irreducible polynomials in $\mathbb{C}[x]$
have degree 1. This fact, along with Theorem 12.4, implies part (3). Part (4) is Corollary 12.8
and part (5) follows immediately from Corollary 12.8.


If $f(x), g(x)$ are already in completely factored form in $F[x]$, then it is easy to find
$\gcd(f(x), g(x))$. Similar to the case for greatest common divisors in $\mathbb{Z}$, the exponent of a
monic, irreducible polynomial $p(x)$ in $\gcd(f(x), g(x))$ is the smaller of the exponents to which
$p(x)$ appears in the factorizations of $f(x)$ and $g(x)$. For example, if

$$f(x) = 6x^7 (x + 7)^4 (x^2 - 3)^9 (x^2 + 1)^2 \quad \text{and} \quad g(x) = 10x (x + 7)^8 (x^2 - 3),$$

then $\gcd(f(x), g(x)) = x (x + 7)^4 (x^2 - 3)$.
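
The rule of taking the smaller exponent is easy to express in code once each factorization is stored as a table of exponents. In the Python sketch below, a factorization is a dictionary from a label for each monic irreducible factor to its exponent. The labels, the function name gcd_from_factorizations, and the particular exponents entered for f are illustrative assumptions of ours. The leading constants 6 and 10 are omitted because the greatest common divisor is monic.

```python
def gcd_from_factorizations(f_factors, g_factors):
    """Keep each common irreducible factor with the smaller of its two exponents."""
    return {factor: min(exp, g_factors[factor])
            for factor, exp in f_factors.items()
            if factor in g_factors}

# Exponent tables for the monic irreducible factors of f(x) and g(x)
f_factors = {"x": 7, "x + 7": 4, "x^2 - 3": 9, "x^2 + 1": 2}
g_factors = {"x": 1, "x + 7": 8, "x^2 - 3": 1}

common = gcd_from_factorizations(f_factors, g_factors)
```

The resulting table assigns exponent 1 to $x$, exponent 4 to $x + 7$, and exponent 1 to $x^2 - 3$. The factor $x^2 + 1$ does not appear at all, since it does not divide $g(x)$, so its exponent in $f(x)$ has no effect on the answer.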


We have seen many examples where $f(x) \in F[x]$, $F \subseteq K$ are fields, and the complete
factorization of $f(x)$ is very different in $F[x]$ than it is in $K[x]$. In light of this, it is natural to
wonder if the greatest common divisor of two polynomials $f(x), g(x) \in F[x]$ changes when we
move to a larger field $K$. The next result shows that although the factorization of $f(x)$ and $g(x)$
can change when we move to a larger field, the greatest common divisor remains the same.


Lemma 12.18. If $F \subseteq K$ are fields and $f(x), g(x) \in F[x]$, then the greatest common divisor
of $f(x)$ and $g(x)$ in $F[x]$ is the same as it is in $K[x]$.


Proof. Let $c(x) = \gcd(f(x), g(x))$ in $F[x]$ and $d(x) = \gcd(f(x), g(x))$ in $K[x]$; we need to
show that $c(x) = d(x)$. Since $F \subseteq K$, it follows that in $K[x]$ both $c(x)$ and $d(x)$ can be written
as a multiple of $f(x)$ plus a multiple of $g(x)$. In addition, since $c(x)$ and $d(x)$ both divide $f(x)$
and $g(x)$ in $K[x]$, Lemma 12.12 implies that $c(x) \mid d(x)$ and $d(x) \mid c(x)$ in $K[x]$. Thus
$c(x) = \gamma d(x)$, for some $\gamma \in K$. But since $c(x)$ and $d(x)$ are both monic, we immediately see
that $\gamma = 1$ and $c(x) = d(x)$, as desired.


We will conclude this chapter by dealing with a problem we mentioned in Chapter 1. A root of
a polynomial is called a multiple root if its multiplicity is greater than one. In calculus, you
probably noticed that the multiple roots of a polynomial $f(x) \in \mathbb{R}[x]$ were precisely the roots
of $f(x)$ that were also roots of its derivative, $f'(x)$. As we shall see, this fact follows from
an algorithm that tells us when a polynomial $f(x) \in F[x]$ has multiple roots in a field $K$
containing $F$.


It is quite surprising that there is a simple algorithm for determining whether a polynomial has 
multiple roots. As we have seen, it can be extremely difficult to find the roots of a polynomial. 
In fact, Galois has shown that using only basic algebraic operations, there are many 
polynomials where it is impossible to find the roots. Thus, it is indeed a surprise that there is 
an algorithm, that essentially consists of computing a greatest common divisor, that will tell us 
if a polynomial has multiple roots. 


Since we will be dealing with fields more general than R and C, we will need to look at 
derivatives in a formal, algebraic way as opposed to as the limit of difference quotients. This 
leads us to 


Definition 12.19. If $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0 \in F[x]$, then the formal
derivative $f'(x)$ of $f(x)$ is defined as

$$f'(x) = n a_n x^{n-1} + (n - 1) a_{n-1} x^{n-2} + \cdots + 2 a_2 x + a_1.$$

Although the formal derivative of a polynomial looks like the derivative from calculus, we
need to be careful about some of the notation in Definition 12.19. For example, if
$g(x) = a_m x^m \in F[x]$, then $g'(x) = m a_m x^{m-1}$. However, whereas $a_m \in F$, $m$ is not necessarily an
element of $F$, so we need to clarify what we mean by the product $m a_m$.


If $a \in F$ and $n$ is a positive integer, the term $na$ will be a shorthand for the sum

$$a + a + \cdots + a$$

of $n$ copies of $a$. As a result, it is certainly the case that $na \in F$. It is now easy to see that if
$f(x) \in F[x]$, then $f'(x) \in F[x]$.


For example, consider $g(x) = [2]_5 x^5 + [3]_5 \in \mathbb{Z}_5[x]$. Since the sum of five copies of $[2]_5$ is
equal to $[0]_5$ in $\mathbb{Z}_5$, it follows that $g'(x) = [0]_5$ despite the fact that $g(x)$ has degree 5. This
example indicates that fields split into two very different classes.
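
To experiment with formal derivatives in $\mathbb{Z}_p[x]$, one might use a short routine such as the following Python sketch. It assumes, as our own convention, that a polynomial is a list of integers whose entry in position i is the coefficient of $x^i$, and it uses the fact that the sum of i copies of a coefficient c in $\mathbb{Z}_p$ is the integer product i times c reduced modulo p. The name formal_derivative is hypothetical.

```python
def formal_derivative(coeffs, p):
    """Formal derivative of a polynomial over Z_p.

    coeffs[i] is the coefficient of x**i.  The zero polynomial is
    represented by the empty list.
    """
    # i*c mod p is the sum of i copies of c in Z_p; drop the constant term
    deriv = [(i * c) % p for i, c in enumerate(coeffs)][1:]
    while deriv and deriv[-1] == 0:
        deriv.pop()
    return deriv

# g(x) = [2]_5 x^5 + [3]_5 in Z_5[x]
g = [3, 0, 0, 0, 0, 2]
g_prime = formal_derivative(g, 5)

# The same coefficients read in Z_7[x], where 5 is not zero
g_prime_mod_7 = formal_derivative(g, 7)
```

In $\mathbb{Z}_5[x]$ the list g_prime is empty, that is, the derivative is zero even though g has degree 5. Reading the same coefficients in $\mathbb{Z}_7[x]$ gives the derivative $[3]_7 x^4$, since $5 \cdot 2 = 10$ reduces to 3 modulo 7.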


The more familiar situation for us is when a field $F$ has the property that, for every $n \in \mathbb{N}$, the
sum of $n$ copies of the multiplicative identity is not equal to the additive identity. This is
certainly the case for $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$. On the other hand, fields like $\mathbb{Z}_2$, $\mathbb{Z}_3$, $\mathbb{Z}_5$, and $\mathbb{Z}_p$, for any
prime $p$, have the property that a sum of copies of the multiplicative identity does equal the
additive identity. In this situation, let $n$ be the smallest positive integer where this occurs.
Observe that in our examples above, $n$ is always a prime. This is not a coincidence.


Suppose $n = a \cdot b$, where $1 \le a, b \le n$. If we let $A$ and $B$ denote, respectively, the sum of $a$
copies and $b$ copies of the multiplicative identity, then the distributive law tells us that $A \cdot B$ is
the sum of $n$ copies of the multiplicative identity. Thus, $A \cdot B$ is equal to the additive identity.
However, since we are working in a field, either $A$ or $B$ must be equal to the additive identity.
By the minimality of $n$, either $a = n$ or $b = n$. As a result, $n$ cannot be written as a product of
smaller positive integers, so $n$ is indeed prime. We can summarize these observations in


Definition 12.20. Let F be a field. If a sum of copies of the multiplicative identity of F equals 
the additive identity, let p be the smallest number of copies where this occurs. In this case, p is 
a prime number and we say that F has characteristic p. On the other hand, if no sum of copies 
of the multiplicative identity equals the additive identity, we say that F has characteristic 0. 


At the moment, it is not clear why the formal derivative is useful to us. However, it turns out 
that the formal derivative satisfies the familiar product rule and that is precisely what we need 
to obtain our result on multiple roots. Since the formal derivative is defined algebraically, and 
not in terms of limits, the proof of the product rule will be purely algebraic. 


Lemma 12.21. If $f(x), g(x) \in F[x]$, then

$$(f(x) \cdot g(x))' = f'(x) \cdot g(x) + f(x) \cdot g'(x).$$


Proof. Suppose $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0$ and
$g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_2 x^2 + b_1 x + b_0$, then

$$f'(x) = (n a_n) x^{n-1} + ((n - 1) a_{n-1}) x^{n-2} + \cdots + (2 a_2) x + a_1$$

and

$$g'(x) = (m b_m) x^{m-1} + ((m - 1) b_{m-1}) x^{m-2} + \cdots + (2 b_2) x + b_1.$$

Therefore, if $t \ge 1$, then the distributive law tells us that the coefficient of $x^{t-1}$ in $f'(x) \cdot g(x)$ is

$$(t a_t)(b_0) + ((t - 1) a_{t-1})(b_1) + \cdots + (2 a_2)(b_{t-2}) + (a_1)(b_{t-1})$$

and the coefficient of $x^{t-1}$ in $f(x) \cdot g'(x)$ is

$$(a_{t-1})(b_1) + (a_{t-2})(2 b_2) + \cdots + (a_2)((t - 2) b_{t-2}) + (a_1)((t - 1) b_{t-1}) + (a_0)(t b_t).$$

Adding these two expressions, we can see that the coefficient of $x^{t-1}$ in
$f'(x) \cdot g(x) + f(x) \cdot g'(x)$ is

$$t(a_t \cdot b_0) + t(a_{t-1} \cdot b_1) + t(a_{t-2} \cdot b_2) + \cdots + t(a_2 \cdot b_{t-2}) + t(a_1 \cdot b_{t-1}) + t(a_0 \cdot b_t).$$

On the other hand, the coefficient of $x^t$ in $f(x) \cdot g(x)$ is

$$a_t \cdot b_0 + a_{t-1} \cdot b_1 + a_{t-2} \cdot b_2 + \cdots + a_2 \cdot b_{t-2} + a_1 \cdot b_{t-1} + a_0 \cdot b_t.$$

Therefore, the coefficient of $x^{t-1}$ in $(f(x) \cdot g(x))'$ is

$$t(a_t \cdot b_0 + a_{t-1} \cdot b_1 + a_{t-2} \cdot b_2 + \cdots + a_2 \cdot b_{t-2} + a_1 \cdot b_{t-1} + a_0 \cdot b_t).$$

As we can see, for every $t \ge 1$, the coefficient of $x^{t-1}$ in $f'(x) \cdot g(x) + f(x) \cdot g'(x)$ and
$(f(x) \cdot g(x))'$ is the same. Hence, $(f(x) \cdot g(x))' = f'(x) \cdot g(x) + f(x) \cdot g'(x)$, as desired.
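
Although the proof above settles the matter, it can be reassuring to test the product rule on concrete polynomials over $\mathbb{Z}_p$. The Python sketch below compares both sides coefficient by coefficient. The list representation (entry i is the coefficient of $x^i$) and the helper names trim, derivative, multiply, add, and product_rule_holds are assumptions of ours.

```python
def trim(f):
    """Drop zero coefficients from the high-degree end."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def derivative(f, p):
    """Formal derivative over Z_p."""
    return trim([(i * c) % p for i, c in enumerate(f)][1:])

def multiply(f, g, p):
    """Product of two polynomials over Z_p."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return trim(out)

def add(f, g, p):
    """Sum of two polynomials over Z_p."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return trim([(a + b) % p for a, b in zip(f, g)])

def product_rule_holds(f, g, p):
    """Check (f*g)' == f'*g + f*g' for one pair of polynomials."""
    lhs = derivative(multiply(f, g, p), p)
    rhs = add(multiply(derivative(f, p), g, p),
              multiply(f, derivative(g, p), p), p)
    return lhs == rhs

# f(x) = 2x^5 + 3 and g(x) = 2x^3 + 4x + 1 in Z_5[x]
check = product_rule_holds([3, 0, 0, 0, 0, 2], [1, 4, 0, 2], 5)
```

A check of this kind on finitely many examples is of course no substitute for the lemma; it is only a way to catch slips when computing derivatives by hand.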


We can now prove the main result of this section. 


Proposition 12.22. If f(x) € F[x], then the multiple roots of f(x) in a field K containing F 
are precisely the roots in K of gcd( f(x), f'(x)). In particular, if gcd( f(x), f’(x)) = 1 in F[x], 
then f(x) does not have multiple roots in any field containing F. 


Before proving Proposition 12.22, we will examine some examples. 


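Examples


1. Let $f(x) = x^3 + 4x^2 + 5x + 2 \in \mathbb{Q}[x]$; then $f'(x) = 3x^2 + 8x + 5$ and we need to
compute $\gcd(x^3 + 4x^2 + 5x + 2, 3x^2 + 8x + 5)$. We now apply the Euclidean
Algorithm. First, we divide $x^3 + 4x^2 + 5x + 2$ by $3x^2 + 8x + 5$ and obtain

$$x^3 + 4x^2 + 5x + 2 = \left(\tfrac{1}{3}x + \tfrac{4}{9}\right) \cdot (3x^2 + 8x + 5) + \left(-\tfrac{2}{9}x - \tfrac{2}{9}\right).$$

Next we divide $3x^2 + 8x + 5$ by $-\tfrac{2}{9}x - \tfrac{2}{9}$ and we have

$$3x^2 + 8x + 5 = \left(-\tfrac{27}{2}x - \tfrac{45}{2}\right) \cdot \left(-\tfrac{2}{9}x - \tfrac{2}{9}\right) + 0.$$

Our last nonzero remainder is $-\tfrac{2}{9}x - \tfrac{2}{9}$, but it is not monic. However, if we multiply it
by $-\tfrac{9}{2}$, we obtain $x + 1$. Thus, $x + 1 = \gcd(x^3 + 4x^2 + 5x + 2, 3x^2 + 8x + 5)$. Since the
only root of $x + 1$ is $-1$, Proposition 12.22 asserts that $-1$ is the only multiple root
of $x^3 + 4x^2 + 5x + 2$ in any field containing $\mathbb{Q}$.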



2. Let $g(x) = x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80 \in \mathbb{Q}[x]$; then
$g'(x) = 5x^4 + 20x^3 + 24x^2 + 80x + 16$ and we need to find
$\gcd(x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80,\ 5x^4 + 20x^3 + 24x^2 + 80x + 16)$.
First, we divide $x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80$ by
$5x^4 + 20x^3 + 24x^2 + 80x + 16$ and obtain

$$x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80 = \left(\tfrac{1}{5}x + \tfrac{1}{5}\right) \cdot (5x^4 + 20x^3 + 24x^2 + 80x + 16) + \left(-\tfrac{4}{5}x^3 + \tfrac{96}{5}x^2 - \tfrac{16}{5}x + \tfrac{384}{5}\right).$$

Next, divide $5x^4 + 20x^3 + 24x^2 + 80x + 16$ by $-\tfrac{4}{5}x^3 + \tfrac{96}{5}x^2 - \tfrac{16}{5}x + \tfrac{384}{5}$ and we have

$$5x^4 + 20x^3 + 24x^2 + 80x + 16 = \left(-\tfrac{25}{4}x - 175\right) \cdot \left(-\tfrac{4}{5}x^3 + \tfrac{96}{5}x^2 - \tfrac{16}{5}x + \tfrac{384}{5}\right) + (3364x^2 + 13456).$$

Next, divide $-\tfrac{4}{5}x^3 + \tfrac{96}{5}x^2 - \tfrac{16}{5}x + \tfrac{384}{5}$ by $3364x^2 + 13456$ and we have

$$-\tfrac{4}{5}x^3 + \tfrac{96}{5}x^2 - \tfrac{16}{5}x + \tfrac{384}{5} = \left(-\tfrac{1}{4205}x + \tfrac{24}{4205}\right) \cdot (3364x^2 + 13456) + 0.$$

Our last nonzero remainder is $3364x^2 + 13456$, but it is not monic. If we multiply it
by $\tfrac{1}{3364}$, we obtain $x^2 + 4$. Thus,

$$x^2 + 4 = \gcd(x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80,\ 5x^4 + 20x^3 + 24x^2 + 80x + 16).$$

The polynomial $x^2 + 4$ has no roots in $\mathbb{R}$ but does have roots $\pm 2i$ in $\mathbb{C}$. As a result,
Proposition 12.22 tells us that $x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80$ has no multiple
roots in $\mathbb{R}$ but does have two multiple roots in $\mathbb{C}$.


In fact, we can say quite a bit more about $x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80$. When we
look at the complete factorization of $x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80$ in $\mathbb{C}[x]$ the
exponents of both $x - 2i$ and $x + 2i$ must be at least 2. Thus,

$$x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80 = (x - 2i)^2 (x + 2i)^2 s(x) = (x^2 + 4)^2 s(x),$$

for some $s(x) \in \mathbb{R}[x]$. By looking at the leading and trailing terms of
$x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80$ and $(x^2 + 4)^2 = x^4 + 8x^2 + 16$, it is easy to see that
$s(x) = x + 5$. Thus,

$$x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80 = (x^2 + 4)^2 (x + 5)$$

is the complete factorization of $x^5 + 5x^4 + 8x^3 + 40x^2 + 16x + 80$ in $\mathbb{R}[x]$.


3. Let $\alpha$ be a nonzero real number. It follows from Theorem 6.8 that the $n$th roots of $\alpha$
in $\mathbb{C}$ are all different. We can also illustrate this point using Proposition 12.22. If we
let $h(x) = x^n - \alpha \in \mathbb{C}[x]$, then $h'(x) = n x^{n-1}$. It is easy to see that $h(x)$ and $h'(x)$ are
relatively prime, so $x^n - \alpha$ has no multiple roots in $\mathbb{C}$.


We can now prove Proposition 12.22. The proof will incorporate and integrate many of the 
ideas from this chapter. 


Proof of Proposition 12.22. In one direction, suppose $K$ is a field containing $F$ and $\alpha \in K$ is a root of $\gcd(f(x), f'(x))$. Corollary 12.10 asserts that $x - \alpha$ is a factor of both $f(x)$ and $f'(x)$ in $K[x]$ and we now have

(13) $f(x) = (x - \alpha)\cdot p(x)$,

for some $p(x) \in K[x]$. Taking the formal derivative and applying the product rule to the previous equation gives us

$$f'(x) = (x - \alpha)'\cdot p(x) + (x - \alpha)\cdot p'(x) = p(x) + (x - \alpha)\cdot p'(x).$$

Since $x - \alpha$ is also a factor of $f'(x)$, replacing $x$ by $\alpha$ in the previous equation tells us that $p(\alpha) = 0$. Using Corollary 12.10, we can see that $x - \alpha$ is also a factor of $p(x)$ in $K[x]$. Therefore, $p(x) = (x - \alpha)\cdot q(x)$, for some $q(x) \in K[x]$. Plugging this into equation (13) results in

$$f(x) = (x - \alpha)\cdot p(x) = (x - \alpha)\cdot\big((x - \alpha)\cdot q(x)\big) = (x - \alpha)^2\cdot q(x).$$

Therefore, $\alpha$ is a multiple root of $f(x)$ in $K$.

In the other direction, suppose $K$ is a field containing $F$ and $\alpha \in K$ is a multiple root of $f(x)$. Therefore, we now have

(14) $f(x) = (x - \alpha)^2\cdot q(x)$,

where $q(x) \in K[x]$. Since

$$(x - \alpha)^2 = x^2 - 2\alpha x + \alpha^2,$$

it follows that

$$\big((x - \alpha)^2\big)' = 2x - 2\alpha = 2(x - \alpha).$$

Applying the product rule to equation (14) and substituting the previous equation, we obtain

$$f'(x) = \big((x - \alpha)^2\big)'\cdot q(x) + (x - \alpha)^2\cdot q'(x) = 2(x - \alpha)\cdot q(x) + (x - \alpha)^2\cdot q'(x).$$

Replacing $x$ by $\alpha$ in the previous equation shows that $f'(\alpha) = 0$. Corollary 12.10 asserts that $x - \alpha$ is a factor of $f'(x)$, so it is a common factor of $f(x)$ and $f'(x)$ in $K[x]$. By Theorem 12.13(b), $x - \alpha$ is a factor of $\gcd(f(x), f'(x))$, so $\alpha$ is a root of $\gcd(f(x), f'(x))$. Observe that although we are doing computations in $K$ and $K[x]$, Lemma 12.18 already told us that the greatest common divisor of $f(x)$ and $f'(x)$ remains the same regardless of whether we are working in $F[x]$ or $K[x]$.

Finally, if $\gcd(f(x), f'(x)) = 1$, then $\gcd(f(x), f'(x))$ has no roots in any field $K$ containing $F$. Hence, in this situation, $f(x)$ does not have multiple roots in any field containing $F$.


If a field $F$ has characteristic 0 and $\alpha$ is a nonzero element of $F$ then, for any $n \in \mathbb{N}$, we have

$$n\alpha = \alpha + \alpha + \cdots + \alpha = 1\cdot\alpha + 1\cdot\alpha + \cdots + 1\cdot\alpha = (1 + 1 + \cdots + 1)\cdot\alpha,$$

which is a product of two nonzero elements of $F$. As a result, $n\alpha \neq 0$ and we will make use of this fact in the following application of Proposition 12.22.


Corollary 12.23. If $f(x) \in F[x]$ is irreducible in $F[x]$, where $F$ has characteristic 0, then $f(x)$ has no multiple roots in any field $K$ containing $F$.

Proof. If the leading term of $f(x)$ is $\alpha_n x^n$, then the leading term of $f'(x)$ is $n\alpha_n x^{n-1}$. Since $F$ has characteristic 0, $n\alpha_n \neq 0$, hence $f'(x)$ is a polynomial of degree $n - 1$.

Since $f(x)$ is irreducible, the only monic polynomial of degree less than $n$ that divides $f(x)$ is the constant polynomial 1. If $c(x) = \gcd(f(x), f'(x))$, then $c(x)$ is a monic polynomial that divides both $f(x)$ and $f'(x)$. Since $c(x)$ divides $f'(x)$, the degree of $c(x)$ cannot exceed $n - 1$. In particular, $c(x)$ has smaller degree than $f(x)$. But since $c(x)$ also divides $f(x)$, it follows that $c(x) = 1$. Hence, $\gcd(f(x), f'(x)) = 1$, and Proposition 12.22 tells us that $f(x)$ does not have multiple roots in any field containing $F$.


Specializing Corollary 12.23 to the more familiar situation of polynomials with rational 
coefficients, we immediately have 


Corollary 12.24. If $f(x) \in \mathbb{Q}[x]$ is irreducible in $\mathbb{Q}[x]$, then $f(x)$ has no multiple roots in $\mathbb{C}$.


Observe that although Galois' work tells us that we cannot find the roots of $x^5 - 6x + 3$, $x^5 - 4x + 2$, and $x^5 - 8x + 6$ using our standard algebraic tools, Corollary 12.24 immediately tells us that none of these polynomials has a multiple root in $\mathbb{C}$.


Whereas Corollary 12.24 is quite concrete, Corollary 12.23 looks more abstract. In particular, 
you might wonder if the hypothesis that F have characteristic 0 is needed. We will now show 
that it is. 


Example

Let $R = \mathbb{Z}_2[t^2]$ be the ring of polynomials with coefficients in $\mathbb{Z}_2$ using the variable $t^2$. Therefore, $R$ consists of those polynomials where the exponent of $t$ is always even.

In calculus, you dealt with the field $\mathbb{R}(x)$ of rational functions. It consisted of those functions that could be written as a quotient of polynomials in $\mathbb{R}[x]$. The construction of $\mathbb{R}(x)$ from $\mathbb{R}[x]$ is done using equivalence relations and is almost identical to the construction of $\mathbb{Q}$ from $\mathbb{Z}$.

In the same way, we can construct the field $F = \mathbb{Z}_2(t^2)$ from $R = \mathbb{Z}_2[t^2]$. Examples of elements of $F$ are


[1]o0? [Vort+(1Jo = [1]t°+[1]or4 
[Hot4+f1]}o° [1Jor®+[1]o04’ [1] 


Observe that if $\alpha \in F$, then $\alpha$ can be represented in many different ways in the form $\alpha = \frac{a(t)}{b(t)}$, where $a(t), b(t) \in \mathbb{Z}_2[t^2]$. Note that $\deg(a(t)) - \deg(b(t))$ must be even. If we now consider $\alpha^2$, then $\alpha^2 = \frac{a(t)^2}{b(t)^2}$ and $\deg(a(t)^2) - \deg(b(t)^2)$ must be a multiple of 4. This tells us that if $\alpha \in F$, then $\alpha^2$ cannot possibly be equal to $t^2$.

The upshot of all this is that the polynomial $x^2 - t^2 \in F[x]$ does not have any roots in $F$. Since $x^2 - t^2$ has degree 2, this tells us that $x^2 - t^2$ is irreducible in $F[x]$. Now, let $K = \mathbb{Z}_2(t)$ be the field consisting of all quotients of polynomials from $\mathbb{Z}_2[t]$. It is easy to see that $K$ is a field that contains $F$. Furthermore, if we work in $K[x]$, we can factor $x^2 - t^2$ as

$$x^2 - t^2 = (x - t)^2.$$

Thus, $x^2 - t^2$ has multiple roots in $K$ despite the fact that it is irreducible in $F[x]$.
, Pp P 


This example can be modified to the fields $F = \mathbb{Z}_p(t^p)$ and $K = \mathbb{Z}_p(t)$, where $p$ is a prime. In this case, $K$ contains $F$ and both have characteristic $p$. The polynomial $x^p - t^p \in F[x]$ is irreducible in $F[x]$, yet it factors over $K[x]$ as

$$x^p - t^p = (x - t)^p.$$

In the exercises, we will verify that $x^p - t^p$ is indeed irreducible in $F[x]$. Thus, we will have shown that, for every prime $p$, there exists a field $F$ of characteristic $p$ and a polynomial $f(x) \in F[x]$ such that $f(x)$ is irreducible in $F[x]$ and has multiple roots in a field $K$ containing $F$. This further illustrates the need for our field to have characteristic 0 in Corollary 12.23.
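The factorization $x^p - t^p = (x - t)^p$ rests on the fact that, in characteristic $p$, every middle binomial coefficient $\binom{p}{k}$ with $0 < k < p$ is a multiple of $p$ and therefore vanishes. The short sketch below checks this numerically for a few small primes. It assumes the usual binomial expansion of $(x - t)^p$, and the function name `expand_x_minus_t_to_p` is our own.

```python
from math import comb

def expand_x_minus_t_to_p(p):
    """Coefficients mod p of (x - t)^p; entry k multiplies x^(p-k) * t^k."""
    return [(comb(p, k) * (-1) ** k) % p for k in range(p + 1)]

results = {p: expand_x_minus_t_to_p(p) for p in (2, 3, 5, 7)}
for p, coeffs in results.items():
    # Only the first entry (for x^p) and the last entry (for t^p) survive;
    # the last entry is p - 1, which is -1 in Z_p (and equals 1 when p = 2).
    print(p, coeffs)
```

In each case only the $x^p$ and $t^p$ terms remain, with the coefficient of $t^p$ equal to $-1$ in $\mathbb{Z}_p$. For $p = 2$ this is consistent with the Example above, since $-1 = 1$ in $\mathbb{Z}_2$.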






Exercises for Section 12.5 


In exercises 1-14, find all multiple roots in $\mathbb{C}$ of the given polynomial.

1. $x^3 + 3x^2 - 9x + 5$
2. $x^3 + 3x^2 - 9x + 6$
3. $x^3 + 6x^2 + 15x + 12$
4. $x^3 + 7x^2 + 16x + 12$
5. $x^4 + 2x^3 + 8x^2 + 14x + 7$
6. $x^4 + 2x^3 + 8x^2 + 14x + 6$
7. $x^5 + 2x^4 - 6x^3 - 12x^2 + 9x + 18$

a 2x 26 = 129 8e 18 

6x71 

10. $x^4 + x^2 - 6$
11. $x^5 - 3x^4 + 10x^3 - 30x^2 + 25x - 75$
12. $x^5 - 3x^4 + 9x^3 - 30x^2 + 24x - 75$
13. $x^4 - 3x^3 + 3x^2 + 12x - 12$
14. $x^4 - 4x^3 + x^2 + 12x - 12$

15. Does $x^2 + x \in \mathbb{Z}_2[x]$ have any roots in $\mathbb{Z}_2$ that are also roots of its derivative?
16. Does $x^3 + x \in \mathbb{Z}_3[x]$ have any roots in $\mathbb{Z}_3$ that are also roots of its derivative?
17. Does $x^3 + [2]_3 x^2 + x \in \mathbb{Z}_3[x]$ have any roots in $\mathbb{Z}_3$ that are also roots of its derivative?


18. Let $F$ be a field of characteristic $p \neq 0$. Suppose $f(x) \in F[x]$ is irreducible in $F[x]$ and has degree $n$, where $n$ is not a multiple of $p$. Show that $f(x)$ does not have multiple roots in any field containing $F$.

19. Let $F \subseteq K$ be fields of characteristic $p \neq 0$ and suppose $\alpha \in K$ such that $\alpha \notin F$ and $\alpha^p \in F$.

(a) Show that if $n$ is a positive integer less than $p$, then the polynomial $(x - \alpha)^n$ belongs to $K[x]$ but does not belong to $F[x]$.

(b) Completely factor $x^p - \alpha^p$ in $K[x]$.

(c) If $b(x) \in K[x]$ is a monic divisor of $x^p - \alpha^p$ in $K[x]$, show that $b(x) = (x - \alpha)^n$, for some integer $n$, where $0 \leq n \leq p$.

(d) Show that $x^p - \alpha^p$ is irreducible in $F[x]$.

(e) Conclude that $x^p - \alpha^p$ is an irreducible polynomial in $F[x]$ with multiple roots in $K$.

20. Let $F$ be a field with a finite number of elements and let $p$ denote the characteristic of $F$. (Observe that $F$ cannot have characteristic 0.) Show that the function $\sigma : F \to F$ defined as $\sigma(r) = r^p$, for all $r \in F$, is an automorphism of the field $F$.

21. Give an example of a field $F$ of characteristic $p \neq 0$ such that the function $\sigma : F \to F$ defined as $\sigma(r) = r^p$, for all $r \in F$, is not an automorphism of the field $F$ because it fails to be surjective.




Difference Functions and Partial Fractions


In this chapter, we make repeated use of Mathematical Induction and apply several ideas from 
Chapter 12 to address some problems that arise in courses in precalculus and calculus. 


In precalculus courses, we are taught to recognize linear functions by applying the following test: A function is linear if and only if a fixed change in the variable always results in a fixed change in the function.


Let us apply this test to the following table: 


x: —7 —3 1 5 9 13 17 21 25 
F(x): 30 23 16 9 2 —5 —12 —19 —26 


Observe that throughout this table, as the variable continues to increase by 4, the function continues to decrease by 7. Thus, this function passes the test, and it is then easy to determine that this table is produced by the linear function $-\frac{7}{4}x + \frac{71}{4}$.


On the other hand, let us now apply this test to the following table: 


x:     -9   -7   -5   -3   -1    1    3    5    7
g(x): -51  -16    1    6    5    4    9   26   61


Observe that whereas the variable continues to increase by 2 throughout the table, the change in the function is not fixed. Thus, this function fails the test, and this table cannot be
produced by a linear function. In this chapter, we will present a new test that is easy to 
apply that will allow us to determine when a table of values is produced by a polynomial. 
Then, if the table is indeed produced by a polynomial, we will find the smallest-degree 
polynomial that can produce the table. By applying this new test, we will show that the 
preceding table for g(x) is produced by a cubic polynomial, and we will find that 
polynomial. 


Next, we will revisit the types of formulas you were often asked to verify after first being introduced to Mathematical Induction. Typically, students are asked to use Mathematical Induction to verify formulas such as

$$2^2 + 4^2 + 6^2 + \cdots + (2n - 2)^2 + (2n)^2 = \frac{(2n)(n + 1)(2n + 1)}{3}.$$

Whereas it can be useful to verify formulas of this type, it is more important and far more interesting to actually derive them. We will show how to use our results on polynomials and tables of values to derive many formulas like the preceding one.


We will conclude this chapter by examining the algorithm we are introduced to in calculus that 
allows us to decompose rational functions into the sum of a polynomial and partial fractions. 
Recall that this algorithm is extremely useful in finding integrals of rational functions and is 
also often used to compute the sum of an infinite series. However, in calculus courses, it is 
never explained why this algorithm works. Ultimately, this algorithm produces a system of 
linear equations. Therefore, it is tempting to believe that the underlying reason that this 
algorithm works is based on some facts from linear algebra. However, it may come as 
somewhat of a surprise that the keys to partial fraction decomposition are actually the division 
algorithm and the Euclidean Algorithm in the polynomial ring R[x]. 


13.1 Difference Functions 
Let us begin by looking at the following table: 


x: 0 1 2 3 4 5 6 7 8 
F(x): 93 15 —9 21 105 243 435 681 981 


It is quite natural to wonder if the preceding table can be produced by a polynomial. As it turns out, this table can be produced by a polynomial, and the smallest-degree polynomial that does the trick is $27x^2 - 105x + 93$. This example helps to motivate a two-part question. First, is there an easy algorithm to determine if a table of values was produced by a polynomial? Next, if the table was produced by a polynomial, how can we find a polynomial that does the job? The key to solving these problems will be to introduce a class of functions we will refer to as difference functions.


Definition 13.1. Given a function $f(x)$, define $f_{(1)}(x)$, the first difference function of $f(x)$, as $f_{(1)}(x) = f(x + 1) - f(x)$. If $n \geq 1$, define $f_{(n+1)}(x) = f_{(n)}(x + 1) - f_{(n)}(x)$, and we call $f_{(n)}(x)$ the $n$th difference function of $f(x)$.


To familiarize ourselves with difference functions, we begin with a few straightforward 
computations. 


Examples

1. If $f(x) = 5$, then $f_{(1)}(x) = f(x + 1) - f(x) = 5 - 5 = 0$.

2. If $g(x) = 4x - 2$, then

$$g_{(1)}(x) = g(x + 1) - g(x) = (4(x + 1) - 2) - (4x - 2) = (4x + 2) - (4x - 2) = 4.$$

Therefore,

$$g_{(2)}(x) = g_{(1)}(x + 1) - g_{(1)}(x) = 4 - 4 = 0.$$

Thus, $g_{(n)}(x) = 0$, for all $n \geq 2$.

3. If $h(x) = 3x^2 + 2x + 5$, then

$$h_{(1)}(x) = h(x + 1) - h(x) = (3(x + 1)^2 + 2(x + 1) + 5) - (3x^2 + 2x + 5) = (3x^2 + 8x + 10) - (3x^2 + 2x + 5) = 6x + 5.$$

Furthermore,

$$h_{(2)}(x) = h_{(1)}(x + 1) - h_{(1)}(x) = (6(x + 1) + 5) - (6x + 5) = (6x + 11) - (6x + 5) = 6.$$

Now observe that $h_{(n)}(x) = 0$, for all $n \geq 3$.

4. If $j(x) = 3^x$, then

$$j_{(1)}(x) = j(x + 1) - j(x) = 3^{x+1} - 3^x = (3 - 1)\cdot 3^x = 2\cdot 3^x.$$

Thus, $j_{(1)}(x)$ is also an exponential function, and it is now easy to see that $j_{(n)}(x)$ is an exponential function, for all $n$.
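Definition 13.1 translates almost word for word into code, which offers a quick way to experiment with other functions. The following is a minimal sketch in which the names `difference` and `nth_difference` are our own. It recomputes Examples 3 and 4 at a few sample points.

```python
def difference(f):
    """Return the first difference function x -> f(x + 1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

def nth_difference(f, n):
    """Apply the first-difference operation n times, as in Definition 13.1."""
    for _ in range(n):
        f = difference(f)
    return f

def h(x):
    return 3 * x**2 + 2 * x + 5

h1 = nth_difference(h, 1)    # should agree with 6x + 5
h2 = nth_difference(h, 2)    # should be the constant 6
h3 = nth_difference(h, 3)    # should be identically 0
j1 = nth_difference(lambda x: 3**x, 1)   # should agree with 2 * 3^x

print([h1(x) for x in range(4)])
print([h2(x) for x in range(4)])
print([h3(x) for x in range(4)])
print([j1(x) for x in range(4)])
```

Sampling at finitely many points does not prove the formulas, of course, but it is a convenient check on the algebra in the examples.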


The goal of this section is to show that if the $n$th difference function $f_{(n)}(x)$ is a nonzero constant, then $f(x)$ is a polynomial of degree $n$. As we saw in calculus, if the $n$th derivative of a function is a nonzero constant, then the function is actually a polynomial of degree $n$. At first, it might seem odd that we can prove a result for difference functions that is so similar to one in calculus. However, if we reexamine the definition of the first difference function, we see that

$$f_{(1)}(x) = f(x + 1) - f(x) = \frac{f(x + 1) - f(x)}{1}.$$

Observe that $\frac{f(x + 1) - f(x)}{1}$ is somewhat similar to $\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$, which is the definition of the derivative of $f(x)$. If we now look back at the preceding examples, we can see the formulas we obtained for difference functions are indeed quite similar to the analogous formulas for derivatives.


Definition 13.2. If $n \in \mathbb{N}$, let $G_n(x) = (x + 1)^n - x^n$.

Observe that for any $n \geq 1$, $G_n(x)$ is the first difference function of the function $f(x) = x^n$. Once again, we will do some straightforward computations.

Examples

$G_1(x) = (x + 1) - x = 1$

$G_2(x) = (x + 1)^2 - x^2 = 2x + 1$

$G_3(x) = (x + 1)^3 - x^3 = 3x^2 + 3x + 1$

$G_4(x) = (x + 1)^4 - x^4 = 4x^3 + 6x^2 + 4x + 1$

In the preceding examples, we can see that the leading term of $G_n(x)$ is $nx^{n-1}$, which is the derivative of $x^n$. In light of this, the following lemma should not be too surprising.


Lemma 13.3

(a) $G_n(x) = (x + 1)^n - x^n$ has degree $n - 1$, for all $n \in \mathbb{N}$.

(b) If $f(x) \in \mathbb{R}[x]$ has degree $n$, where $n \geq 1$, then its first difference function $f_{(1)}(x)$ has degree $n - 1$.

(c) If $f(x) \in \mathbb{R}[x]$ has degree $n$, then its $n$th difference function $f_{(n)}(x)$ is a nonzero constant, and its $m$th difference function $f_{(m)}(x)$ is equal to 0, for all $m > n$.

Proof. For part (a), since $(x + 1)^n = x^n + nx^{n-1} + \text{terms of lower degree}$, we have

$$G_n(x) = (x + 1)^n - x^n = (x^n + nx^{n-1} + \text{terms of lower degree}) - x^n = nx^{n-1} + \text{terms of lower degree}.$$

Thus, $G_n(x)$ has degree $n - 1$.

For part (b), if $f(x) \in \mathbb{R}[x]$ has degree $n$, then

$$f(x) = \alpha_n x^n + \alpha_{n-1} x^{n-1} + \cdots + \alpha_1 x + \alpha_0,$$

where $\alpha_i \in \mathbb{R}$ and $\alpha_n \neq 0$. Then

$$\begin{aligned}
f_{(1)}(x) &= f(x + 1) - f(x) \\
&= \big(\alpha_n (x + 1)^n + \alpha_{n-1} (x + 1)^{n-1} + \cdots + \alpha_1 (x + 1) + \alpha_0\big) - \big(\alpha_n x^n + \alpha_{n-1} x^{n-1} + \cdots + \alpha_1 x + \alpha_0\big) \\
&= \alpha_n \big((x + 1)^n - x^n\big) + \alpha_{n-1} \big((x + 1)^{n-1} - x^{n-1}\big) + \cdots + \alpha_1 \big((x + 1) - x\big) \\
&= \alpha_n G_n(x) + \alpha_{n-1} G_{n-1}(x) + \cdots + \alpha_1 G_1(x).
\end{aligned}$$

Since each $G_i(x)$ has degree $i - 1$ and $\alpha_n \neq 0$, we see that $f_{(1)}(x)$ has the same degree as $G_n(x)$. Thus, $f_{(1)}(x)$ has degree $n - 1$.

We know that taking the first derivative of a polynomial drops the degree by 1. In light of part (b), we now also know that this is the case when taking the first difference function of a polynomial. Therefore, if $f(x) \in \mathbb{R}[x]$ has degree $n$, the degree of $f_{(n)}(x)$ must be $n$ less than $n$, namely 0. Thus, $f_{(n)}(x)$ is a nonzero constant. Furthermore, since the first difference function of a constant function is 0, it immediately follows that if $m > n$, then $f_{(m)}(x) = 0$, thereby proving part (c).


Recall that our goal is to show that if $f_{(n)}(x)$ is a nonzero constant, then $f(x)$ is a polynomial of degree $n$. Although we are still quite far from this goal, we can observe that Lemma 13.3(c) tells us the converse is indeed true. Let us briefly return to the table of values we saw at the beginning of this section. If we insert the values of $f_{(1)}(x) = f(x + 1) - f(x)$ into the table, we obtain

x:            0    1    2    3    4    5    6    7    8
f(x):        93   15   -9   21  105  243  435  681  981
f_(1)(x):   -78  -24   30   84  138  192  246  300

Therefore, Lemma 13.3(c) tells us that since $f_{(1)}(x)$ is not constant, $f(x)$ cannot be linear. We will return to this example later in this section.


When looking at tables of values, it often suffices to examine functions whose domain is not 
necessarily the entire set of real numbers but is instead a subset of the integers. Throughout 
this chapter, we will let $\mathbb{N}_0$ denote the set $\mathbb{N} \cup \{0\}$ and will often only concern ourselves with functions defined on $\mathbb{N}_0$. Recall from calculus that if two functions have the same first
derivative, they need not be the same function, but they must differ by a constant. As a result, 
if two functions have the same first derivative and agree at a single point, then they must be the 
same function. Given the similarity between difference functions and derivatives, it is not 
surprising that the analogous result also holds for difference functions. 


Lemma 13.4. Let $f, g : \mathbb{N}_0 \to \mathbb{R}$ be two functions such that $f_{(1)}(x) = g_{(1)}(x)$, for all $x \in \mathbb{N}_0$, and $f(0) = g(0)$. Then $f(x) = g(x)$, for all $x \in \mathbb{N}_0$.

Proof. If $f_{(1)}(x) = g_{(1)}(x)$, for all $x \in \mathbb{N}_0$, and $f(0) = g(0)$, then let

$$T = \{n \in \mathbb{N}_0 \mid f(n) = g(n)\}.$$

It suffices to show that $T = \mathbb{N}_0$, and we will proceed by using Mathematical Induction. Since $f(0) = g(0)$, it is certainly the case that $0 \in T$. Therefore, we must now show that if $T$ contains some $k \in \mathbb{N}_0$, then $T$ also contains $k + 1$. Observe that

$$f_{(1)}(k) = f(k + 1) - f(k) \quad\text{and}\quad g_{(1)}(k) = g(k + 1) - g(k).$$

The previous equations immediately imply that

(1) $f(k + 1) = f_{(1)}(k) + f(k)$ and $g(k + 1) = g_{(1)}(k) + g(k)$.

However, since $k \in T$, we know that $f(k) = g(k)$. Combining this with the fact that $f_{(1)}(k) = g_{(1)}(k)$, the equations in (1) tell us that $f(k + 1) = g(k + 1)$. Therefore, $k + 1 \in T$, as desired.


The next lemma indicates that any polynomial in $\mathbb{R}[x]$ can be rewritten in terms of polynomials of the form $G_n(x)$.

Lemma 13.5. If $f(x) \in \mathbb{R}[x]$ has degree $n$ then there exist $\alpha_1, \ldots, \alpha_{n+1} \in \mathbb{R}$ with $\alpha_{n+1} \neq 0$ such that $f(x) = \alpha_1 G_1(x) + \cdots + \alpha_{n+1} G_{n+1}(x)$.

Proof. We will proceed by using the Second Version of Mathematical Induction. We begin by letting

$$T = \{n \in \mathbb{N}_0 \mid \text{all polynomials of degree } n \text{ have the desired property}\}.$$

It will suffice to show that $T = \mathbb{N}_0$.

We first need to show that $0 \in T$. If $f(x) \in \mathbb{R}[x]$ has degree 0, then $f(x) = \alpha_1$, for some nonzero $\alpha_1 \in \mathbb{R}$. Since $G_1(x) = 1$, we immediately see that $f(x) = \alpha_1 G_1(x)$. Thus $0 \in T$.

Now suppose that $k \in \mathbb{N}_0$ such that every element of the set $\{0, \ldots, k\}$ belongs to $T$. To conclude our proof, we need to show that $k + 1 \in T$. Therefore, we need to show that if $f(x) \in \mathbb{R}[x]$ has degree $k + 1$, then there exist $\alpha_1, \ldots, \alpha_{k+1}, \alpha_{k+2} \in \mathbb{R}$ such that

$$f(x) = \alpha_1 G_1(x) + \cdots + \alpha_{k+1} G_{k+1}(x) + \alpha_{k+2} G_{k+2}(x).$$

Let $a$ denote the leading coefficient of $f(x)$, and then let $\alpha_{k+2} = \frac{a}{k+2}$. Observe that $\alpha_{k+2}$ is certainly nonzero. Furthermore, since $G_{k+2}(x)$ has degree $k + 1$ with leading coefficient $k + 2$, it is clear that $\alpha_{k+2} G_{k+2}(x)$ has the same degree and leading coefficient as $f(x)$. If $f(x) = \alpha_{k+2} G_{k+2}(x)$, then we are done. On the other hand, if $f(x) \neq \alpha_{k+2} G_{k+2}(x)$, then $f(x) - \alpha_{k+2} G_{k+2}(x)$ is a nonzero element of $\mathbb{R}[x]$ of degree less than $k + 1$. If we let $t$ denote the degree of $f(x) - \alpha_{k+2} G_{k+2}(x)$, then $t \in \{0, \ldots, k\} \subseteq T$. Since $t \in T$, there exist $\alpha_1, \ldots, \alpha_{t+1} \in \mathbb{R}$ such that

$$f(x) - \alpha_{k+2} G_{k+2}(x) = \alpha_1 G_1(x) + \cdots + \alpha_{t+1} G_{t+1}(x).$$

Finally, if we let $\alpha_m = 0$, for all $m$ such that $t + 1 < m < k + 2$, the previous equation implies that

$$f(x) = \alpha_1 G_1(x) + \cdots + \alpha_{k+1} G_{k+1}(x) + \alpha_{k+2} G_{k+2}(x).$$

Thus, $k + 1 \in T$, thereby concluding the proof.
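The induction in this proof is really a procedure: match the leading term with a multiple of the appropriate $G_m(x)$, subtract, and repeat on what is left. The sketch below carries this out with exact fractions. Polynomials are coefficient lists with the constant term first, and the names `G`, `in_G_basis`, and `combine` are illustrative choices of ours, not notation from the text.

```python
from fractions import Fraction
from math import comb

def G(n):
    """Coefficients of G_n(x) = (x + 1)^n - x^n, constant term first."""
    return [comb(n, i) for i in range(n)]

def in_G_basis(f):
    """Return [alpha_1, ..., alpha_{n+1}] with f = sum of alpha_m * G_m."""
    r = [Fraction(c) for c in f]
    alphas = [Fraction(0)] * len(r)
    for deg in range(len(r) - 1, -1, -1):
        # G_{deg+1} has degree deg and leading coefficient deg + 1.
        alpha = r[deg] / (deg + 1)
        alphas[deg] = alpha                  # slot deg stores alpha_{deg+1}
        for i, c in enumerate(G(deg + 1)):
            r[i] -= alpha * c
    return alphas

def combine(alphas):
    """Expand alpha_1*G_1 + ... + alpha_m*G_m back into a coefficient list."""
    total = [Fraction(0)] * len(alphas)
    for n, alpha in enumerate(alphas, start=1):
        for i, c in enumerate(G(n)):
            total[i] += alpha * c
    return total

alphas = in_G_basis([0, 0, 1])               # f(x) = x^2
print(alphas, combine(alphas))
```

For $f(x) = x^2$ this produces $\alpha_1 = \tfrac{1}{6}$, $\alpha_2 = -\tfrac{1}{2}$, $\alpha_3 = \tfrac{1}{3}$, and expanding the combination returns $x^2$ again.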


We can now state and prove the first main result of this chapter.

Theorem 13.6. If $f : \mathbb{N}_0 \to \mathbb{R}$ is a function such that its $n$th difference function, $f_{(n)}(x)$, is a nonzero constant, then $f(x)$ is a polynomial of degree $n$.

Proof. We proceed with a proof by induction. To this end, let

$$T = \{k \in \mathbb{N} \mid \text{every function whose } k\text{th difference function is a nonzero constant is equal to a polynomial of degree } k\};$$

and we will use Mathematical Induction to show that $T = \mathbb{N}$. In order to show that $1 \in T$, let us suppose that $f_{(1)}(x) = a$, for all $x \in \mathbb{N}_0$, where $a$ is a nonzero real number. At this point, we would expect that $f(x)$ is a linear function with slope $a$ and $y$-intercept $f(0)$. Thus, it makes sense to try to prove that $f(x)$ is equal to the linear function $ax + f(0)$. Observe that the first difference function of both $f(x)$ and $ax + f(0)$ is the constant function $a$. Furthermore, plugging $x = 0$ into both $f(x)$ and $ax + f(0)$ gives the value $f(0)$. Therefore, Lemma 13.4 tells us that it is indeed the case that $f(x) = ax + f(0)$. Thus, $1 \in T$.

Now let us suppose that $k$ is a positive integer that belongs to $T$; we must show that $k + 1$ also belongs to $T$. Therefore, suppose that $f_{(k+1)}(x) = a$, for all $x \in \mathbb{N}_0$, where $a$ is a nonzero real number. We need to show that $f(x)$ is equal to some polynomial of degree $k + 1$. Observe that if we let $g(x) = f_{(1)}(x)$, then it immediately follows that

$$g_{(k)}(x) = f_{(k+1)}(x) = a.$$

Since the $k$th difference function of $g(x)$ is a nonzero constant and $k \in T$, we know that $g(x)$ is equal to some polynomial of degree $k$. In light of Lemma 13.5, there exist $\alpha_1, \ldots, \alpha_{k+1} \in \mathbb{R}$ with $\alpha_{k+1} \neq 0$ such that

$$g(x) = \alpha_{k+1} G_{k+1}(x) + \alpha_k G_k(x) + \cdots + \alpha_1 G_1(x).$$

Next, let

$$F(x) = \alpha_{k+1} x^{k+1} + \alpha_k x^k + \cdots + \alpha_1 x.$$

Then

$$\begin{aligned}
F_{(1)}(x) &= F(x + 1) - F(x) \\
&= \big(\alpha_{k+1} (x + 1)^{k+1} + \alpha_k (x + 1)^k + \cdots + \alpha_1 (x + 1)\big) - \big(\alpha_{k+1} x^{k+1} + \alpha_k x^k + \cdots + \alpha_1 x\big) \\
&= \alpha_{k+1} \big((x + 1)^{k+1} - x^{k+1}\big) + \alpha_k \big((x + 1)^k - x^k\big) + \cdots + \alpha_1 \big((x + 1) - x\big) \\
&= \alpha_{k+1} G_{k+1}(x) + \alpha_k G_k(x) + \cdots + \alpha_1 G_1(x) = g(x) = f_{(1)}(x).
\end{aligned}$$

As a result, $f(x)$ and $F(x)$ have the same first difference function.

Now, let $H(x) = F(x) + f(0)$. Since $\alpha_{k+1} \neq 0$, $H(x)$ is a polynomial of degree $k + 1$. It is easy to see that

$$H_{(1)}(x) = F_{(1)}(x) = f_{(1)}(x).$$

Furthermore, since $F(0) = 0$, we have

$$H(0) = F(0) + f(0) = 0 + f(0) = f(0).$$

Thus, $H(x)$ and $f(x)$ have the same first difference function and also have the same value at 0. Therefore, Lemma 13.4 tells us that $f(x) = H(x)$ and so, $f(x)$ is a polynomial of degree $k + 1$. Hence, $k + 1 \in T$, as desired.


We can now use Theorem 13.6 to show that the table of values at the beginning of this section is not only produced by a polynomial but is produced by a quadratic. Having already inserted $f_{(1)}(x)$ into the table and seen that $f(x)$ is not linear, we now insert $f_{(2)}(x) = f_{(1)}(x + 1) - f_{(1)}(x)$ into the table.

x:            0    1    2    3    4    5    6    7    8
f(x):        93   15   -9   21  105  243  435  681  981
f_(1)(x):   -78  -24   30   84  138  192  246  300
f_(2)(x):    54   54   54   54   54   54   54

Since $f_{(2)}(x)$ is a nonzero constant, Theorem 13.6 asserts that our table can be produced by a polynomial of degree 2. Thus, $f(x) = ax^2 + bx + c$, for some $a, b, c \in \mathbb{R}$. To find the values of $a, b, c$, we will plug three of the values of $x$ from our table into $f(x)$. Any three choices of $x$ will work, and if we use the values $x = 0, 1, 2$ we obtain

$$\begin{aligned}
93 &= f(0) = a(0)^2 + b(0) + c = c \\
15 &= f(1) = a(1)^2 + b(1) + c = a + b + c \\
-9 &= f(2) = a(2)^2 + b(2) + c = 4a + 2b + c.
\end{aligned}$$

Therefore, we obtain the three linear equations

$$\begin{aligned}
c &= 93 \\
a + b + c &= 15 \\
4a + 2b + c &= -9.
\end{aligned}$$

Solving these three equations in three unknowns, we see that $a = 27$, $b = -105$, $c = 93$. Thus, $f(x) = 27x^2 - 105x + 93$ produces the table. Also note that $f(x)$ is the only quadratic polynomial that produces this table. To see this, let us suppose that $g(x)$ is also a quadratic polynomial that produces this table, and we will show that $f(x) = g(x)$. To accomplish this, let $F(x) = f(x) - g(x)$; it will then suffice to show that $F(x) = 0$, for all $x$. It is certainly the case that $F(x)$ is a polynomial whose degree cannot exceed 2 with the additional properties that $F(0) = F(1) = F(2) = 0$. However, Corollary 12.16 asserts that if a polynomial $p(x)$ has three distinct roots, then either $p(x)$ has degree at least 3 or $p(x)$ is always equal to 0. As a result, $F(x)$ is always equal to 0, so $F(x) = f(x) - g(x) = 0$ and so $f(x) = g(x)$. Thus, $f(x) = 27x^2 - 105x + 93$ is the only quadratic polynomial that produces this table of values.
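The procedure just carried out by hand, repeatedly taking differences until a row becomes a nonzero constant, is easy to automate. The following sketch is one possible way to do so. The function names `difference_rows` and `first_constant_row` are our own, and the code assumes that the table lists $f(0), f(1), f(2), \ldots$ in order.

```python
def difference_rows(values):
    """Return [values, first differences, second differences, ...]."""
    rows = [list(values)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows

def first_constant_row(values):
    """Index n of the first row of differences that is a nonzero constant.

    Rows with a single entry are ignored, since one value cannot show
    that a row is constant. Returns None if no such row is found.
    """
    for n, row in enumerate(difference_rows(values)):
        if len(row) >= 2 and len(set(row)) == 1 and row[0] != 0:
            return n
    return None

table = [93, 15, -9, 21, 105, 243, 435, 681, 981]    # f(0), ..., f(8)
degree = first_constant_row(table)
matches = all(27 * x**2 - 105 * x + 93 == y for x, y in enumerate(table))
print(degree, matches)
```

Here `degree` is the index of the constant row, in agreement with Theorem 13.6, and `matches` confirms that $27x^2 - 105x + 93$ reproduces every entry of the table.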


At first glance, Theorem 13.6 may not appear to handle as general a situation as was described 
in Chapter 1. Note that Theorem 13.6 only deals with functions whose domain is the set $\mathbb{N}_0 = \{0, 1, 2, 3, \ldots\}$. Furthermore, Theorem 13.6 only dealt with the case where $x$ continually
increased by 1, whereas in Chapter 1, we discussed the situation where x continually changed 
by a fixed real number a. Fortunately, it is easy to generalize Theorem 13.6 to this more 
general situation. Therefore, we will now state a more general version of Theorem 13.6 that, as 
we will see, has the advantage that it can be used more often. 


Corollary 13.7. Let $f(x)$ be a real-valued function whose domain is the set $\{b, b + a, b + 2a, b + 3a, b + 4a, \ldots\}$, where $a, b \in \mathbb{R}$ and $a \neq 0$. In this case, let $f_{(1)}(x) = f(x + a) - f(x)$ and $f_{(n+1)}(x) = f_{(n)}(x + a) - f_{(n)}(x)$, for $n \geq 1$. If the function $f_{(n)}(x)$ is a nonzero constant, then $f(x)$ is a polynomial of degree $n$.

Proof. Let $g(x) = f(b + a\cdot x)$; therefore, the domain of $g(x)$ is $\mathbb{N}_0$. If we define the difference functions of $g(x)$ as in Definition 13.1, and if the difference functions of $f(x)$ are as in the statement of this corollary, we have

$$g_{(1)}(x) = g(x + 1) - g(x) = f(b + a(x + 1)) - f(b + a\cdot x) = f((b + a\cdot x) + a) - f(b + a\cdot x) = f_{(1)}(b + a\cdot x).$$

Repeating the same argument shows us that $g_{(2)}(x) = f_{(2)}(b + a\cdot x)$. It is now easy to see the fact, which we could have proved formally using Mathematical Induction, that $g_{(n)}(x) = f_{(n)}(b + a\cdot x)$, for all $n \geq 1$. Since the $n$th difference function of $f(x)$ is a nonzero constant, the same is now true of $g(x)$. Therefore, we can apply Theorem 13.6 to conclude that $g(x)$ is a polynomial of degree $n$. Thus, $f(b + a\cdot x)$ is also a polynomial in $x$ of degree $n$. If we make the substitution $y = b + a\cdot x$, then $x$ gets replaced by $\frac{y - b}{a}$ and we see that $f(y)$ is a polynomial in $y$ whose degree in $y$ is also $n$. However, this is the same as saying that $f(x)$ is also a polynomial of degree $n$ in $x$.


We can now use Corollary 13.7 to revisit the table that appeared at the beginning of this chapter:

x:           -9   -7   -5   -3   -1    1    3    5    7
g(x):       -51  -16    1    6    5    4    9   26   61

Since the variable increases by 2 through the table, we can let $g_{(1)}(x) = g(x + 2) - g(x)$ and $g_{(n+1)}(x) = g_{(n)}(x + 2) - g_{(n)}(x)$, for $n \geq 1$. Adding $g_{(1)}(x)$ to the table, we obtain

x:           -9   -7   -5   -3   -1    1    3    5    7
g(x):       -51  -16    1    6    5    4    9   26   61
g_(1)(x):    35   17    5   -1   -1    5   17   35

Since $g_{(1)}(x)$ is not a constant, we know that the table cannot be produced by a linear function. Therefore, we next add $g_{(2)}(x)$ to the table to obtain

x:           -9   -7   -5   -3   -1    1    3    5    7
g(x):       -51  -16    1    6    5    4    9   26   61
g_(1)(x):    35   17    5   -1   -1    5   17   35
g_(2)(x):   -18  -12   -6    0    6   12   18

Observe that $g_{(2)}(x)$ is also not a constant. Thus, the table cannot be produced by a quadratic function and we add $g_{(3)}(x)$ to the table.

x:           -9   -7   -5   -3   -1    1    3    5    7
g(x):       -51  -16    1    6    5    4    9   26   61
g_(1)(x):    35   17    5   -1   -1    5   17   35
g_(2)(x):   -18  -12   -6    0    6   12   18
g_(3)(x):     6    6    6    6    6    6

Since $g_{(3)}(x)$ is a nonzero constant, Corollary 13.7 now asserts that the table can be produced by a cubic polynomial. Thus, $g(x) = ax^3 + bx^2 + cx + d$. By plugging in any four values of $x$ from the table, we will obtain four linear equations that will allow us to find $a, b, c, d$. In order to deal with smaller numbers, we will let $x = -3, -1, 1, 3$ and then obtain

$$\begin{aligned}
6 &= a(-3)^3 + b(-3)^2 + c(-3) + d = -27a + 9b - 3c + d, \\
5 &= a(-1)^3 + b(-1)^2 + c(-1) + d = -a + b - c + d, \\
4 &= a(1)^3 + b(1)^2 + c(1) + d = a + b + c + d, \\
9 &= a(3)^3 + b(3)^2 + c(3) + d = 27a + 9b + 3c + d.
\end{aligned}$$

Therefore, we have obtained the four equations

$$\begin{aligned}
-27a + 9b - 3c + d &= 6 \\
-a + b - c + d &= 5 \\
a + b + c + d &= 4 \\
27a + 9b + 3c + d &= 9.
\end{aligned}$$

Solving these equations yields $a = \frac{1}{8}$, $b = \frac{3}{8}$, $c = -\frac{5}{8}$, $d = \frac{33}{8}$. Hence, the table can be produced by the cubic polynomial $\frac{1}{8}x^3 + \frac{3}{8}x^2 - \frac{5}{8}x + \frac{33}{8}$. It then follows, by Corollary 12.16, that this is the only cubic polynomial that produces the table.
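The four equations above can also be solved mechanically. The sketch below uses Gauss-Jordan elimination with exact fractions, which is our own choice of method since the text does not say how the system was solved, and then checks the resulting cubic against all nine entries of the table. The helper name `solve` is illustrative.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Gauss-Jordan elimination over Q; assumes the solution is unique."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

points = [(-3, 6), (-1, 5), (1, 4), (3, 9)]
matrix = [[x**3, x**2, x, 1] for x, _ in points]
a, b, c, d = solve(matrix, [y for _, y in points])

xs = list(range(-9, 8, 2))                   # -9, -7, ..., 7
gs = [-51, -16, 1, 6, 5, 4, 9, 26, 61]
fits = all(a * x**3 + b * x**2 + c * x + d == y for x, y in zip(xs, gs))
print(a, b, c, d, fits)
```

The computed coefficients agree with the values found above, and `fits` confirms that the cubic matches every entry of the table, not only the four points used to set up the system.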


In high school algebra courses, we learned that two data points determine a line. Earlier in this 
section, we used three data points to find a quadratic and four data points to find a cubic. We 
will now generalize these and prove that given any n data points, there always exists a unique 
polynomial of degree less than n that agrees with all n data points. 


Proposition 13.8. Let $\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_n \in \mathbb{R}$, where all the $\alpha_i$ are different. Then there exists a unique $f(x) \in \mathbb{R}[x]$ of degree less than $n$ such that $f(\alpha_1) = \beta_1$, $f(\alpha_2) = \beta_2$, ..., $f(\alpha_n) = \beta_n$.

Proof. For each $i \leq n$, let

$$H_i(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_{i-1})(x - \alpha_{i+1}) \cdots (x - \alpha_n).$$

In other words, $H_i(x)$ is the product of all the $x - \alpha_j$ except for $x - \alpha_i$. Next, let $\gamma_i = H_i(\alpha_i)$; since all the $\alpha_j$ are different, we see that $\gamma_i \neq 0$. Also note that if $i \neq j$, then $H_i(\alpha_j) = 0$.

For each $i \leq n$, consider the polynomial $\beta_i \gamma_i^{-1} H_i(x)$. It is easy to see that $\beta_i \gamma_i^{-1} H_i(\alpha_i) = \beta_i$ and $\beta_i \gamma_i^{-1} H_i(\alpha_j) = 0$, for all $j \neq i$. Therefore, if we let

$$f(x) = \beta_1 \gamma_1^{-1} H_1(x) + \beta_2 \gamma_2^{-1} H_2(x) + \cdots + \beta_n \gamma_n^{-1} H_n(x),$$

then

$$\begin{aligned}
f(\alpha_i) &= \beta_1 \gamma_1^{-1} H_1(\alpha_i) + \cdots + \beta_{i-1} \gamma_{i-1}^{-1} H_{i-1}(\alpha_i) + \beta_i \gamma_i^{-1} H_i(\alpha_i) + \beta_{i+1} \gamma_{i+1}^{-1} H_{i+1}(\alpha_i) + \cdots + \beta_n \gamma_n^{-1} H_n(\alpha_i) \\
&= 0 + \cdots + 0 + \beta_i + 0 + \cdots + 0 = \beta_i,
\end{aligned}$$

for all $i \leq n$. Furthermore, since each $H_i(x)$ has degree $n - 1$, the degree of $f(x)$ is less than $n$.

All that remains is to show that $f(x)$ is the only polynomial of degree less than $n$ such that $f(\alpha_i) = \beta_i$, for all $i \leq n$. Let $g(x)$ also be a polynomial of degree less than $n$ such that $g(\alpha_i) = \beta_i$, for all $i \leq n$. We need to show that $f(x) = g(x)$. If we let $F(x) = f(x) - g(x)$, then $F(\alpha_i) = 0$, for all $i \leq n$. Therefore, $F(x)$ has at least $n$ distinct roots and Corollary 12.16 asserts that $F(x)$ must either be 0 or have degree at least $n$. But since the degrees of $f(x)$ and $g(x)$ are less than $n$, the degree of $F(x) = f(x) - g(x)$ cannot possibly be as large as $n$. Hence, $F(x) = f(x) - g(x) = 0$ and $f(x) = g(x)$, as desired.


We now have two different ways to find a polynomial that agrees with $n$ data points. One is to construct the polynomial as in the proof of Proposition 13.8, whereas the other is to set up a system of linear equations as in the example preceding Proposition 13.8. We now provide an example illustrating both techniques.

Example

Find a polynomial $f(x)$ of degree less than 3 such that $f(-1) = 2$, $f(1) = 6$, $f(4) = 87$. Applying the technique used in the proof of Proposition 13.8, let

$$f(x) = 2\gamma_1^{-1}(x - 1)(x - 4) + 6\gamma_2^{-1}(x + 1)(x - 4) + 87\gamma_3^{-1}(x + 1)(x - 1),$$

where

$$\gamma_1 = ((-1) - 1)((-1) - 4) = 10, \quad \gamma_2 = (1 + 1)(1 - 4) = -6, \quad \gamma_3 = (4 + 1)(4 - 1) = 15.$$

Plugging the value of each $\gamma_i$ into the preceding equation for $f(x)$, we obtain

$$f(x) = \frac{1}{5}(x - 1)(x - 4) - (x + 1)(x - 4) + \frac{29}{5}(x + 1)(x - 1).$$

Converting $f(x)$ to the more familiar form of a quadratic gives us $f(x) = 5x^2 + 2x - 1$.

In our second technique, we use the fact that $f(x) = ax^2 + bx + c$, for some $a, b, c \in \mathbb{R}$. Plugging $x = -1, 1, 4$ into $f(x)$, we obtain

$$\begin{aligned}
2 &= f(-1) = a(-1)^2 + b(-1) + c = a - b + c, \\
6 &= f(1) = a(1)^2 + b(1) + c = a + b + c, \\
87 &= f(4) = a(4)^2 + b(4) + c = 16a + 4b + c.
\end{aligned}$$

Solving the three linear equations

$$\begin{aligned}
a - b + c &= 2, \\
a + b + c &= 6, \\
16a + 4b + c &= 87,
\end{aligned}$$

gives us $a = 5$, $b = 2$, and $c = -1$. Thus, once again, $f(x) = 5x^2 + 2x - 1$.
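The first technique, the construction from the proof of Proposition 13.8, can be written out directly. The sketch below builds each $H_i(x)$ and $\gamma_i$ with exact fractions and adds up the terms $\beta_i \gamma_i^{-1} H_i(x)$. Polynomials are coefficient lists with the constant term first, and the names `poly_mul` and `interpolate` are our own.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def interpolate(points):
    """Coefficients of the polynomial of degree < n through n given points.

    Follows the proof of Proposition 13.8; the first coordinates of the
    points are assumed to be distinct.
    """
    n = len(points)
    total = [Fraction(0)] * n
    for i, (alpha_i, beta_i) in enumerate(points):
        H = [Fraction(1)]        # H_i(x): product of (x - alpha_j), j != i
        gamma = Fraction(1)      # gamma_i = H_i(alpha_i)
        for j, (alpha_j, _) in enumerate(points):
            if j != i:
                H = poly_mul(H, [Fraction(-alpha_j), Fraction(1)])
                gamma *= alpha_i - alpha_j
        for k, coeff in enumerate(H):
            total[k] += Fraction(beta_i) * coeff / gamma
    return total

coeffs = interpolate([(-1, 2), (1, 6), (4, 87)])
print(coeffs)
```

For the three data points of the example, the coefficient list corresponds to $-1 + 2x + 5x^2$, in agreement with both techniques above.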


In the preceding example, the second technique produced three linear equations in three 
unknowns. More generally, if we wished to find a polynomial of degree less than n that took 
on n prescribed values, this technique would produce n linear equations in n unknowns. 
Although a system of linear equations can have no solutions or an infinite number of solutions, 
Proposition 13.8 guarantees that the system we obtain in this situation does have a unique 




solution. Therefore, Proposition 13.8 is useful even when we use a technique for finding the 
polynomial other than the technique that appears in its proof. 


13.2 Polynomials and Mathematical Induction 


As an application of Corollary 13.7, we will show how to derive many of the formulas we 
were asked to verify when learning about Mathematical Induction. Several examples of these 
types of formulas are: 


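\[ 1 + 2 + 3 + \cdots + (n-1) + n = \frac{n(n+1)}{2} \]

\[ 1^2 + 2^2 + 3^2 + \cdots + (n-1)^2 + n^2 = \frac{n(n+1)(2n+1)}{6} \]

\[ 1^3 + 2^3 + 3^3 + \cdots + (n-1)^3 + n^3 = \frac{n^2(n+1)^2}{4} \]

\[ 1 + 3 + 5 + \cdots + (2n-3) + (2n-1) = n^2 \]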


Whereas Mathematical Induction is a wonderful tool for proving the validity of these 
formulas, it does not address the much more important question of where these formulas come 
from. After all, the real excitement and challenge is in finding these formulas. 


Observe that in the first formula, we are adding terms obtained from plugging natural numbers
into the linear function $p(x) = x$ and the formula on the right side is a quadratic. In the second
formula, we are adding terms obtained from plugging natural numbers into the quadratic
function $p(x) = x^2$ and the formula on the right side is a cubic. Similarly, for the third formula,
we are adding terms obtained from plugging natural numbers into the cubic $p(x) = x^3$ and the
right side is a quartic. At this point, it is reasonable to ask the following question: If $p(x)$ is a
polynomial of degree $t$, is the formula for $p(1) + p(2) + p(3) + \cdots + p(n-1) + p(n)$ always a
polynomial of degree $t+1$? We can now use Corollary 13.7 to answer this question in the
affirmative.


Corollary 13.9. If $p(x) \in \mathbb{R}[x]$ is a polynomial of degree $t$ and $k \in \mathbb{N}_0$, define the function $f$ as

\[ f(n) = p(k) + p(k+1) + p(k+2) + p(k+3) + \cdots + p(n-1) + p(n), \]

where $n \in \mathbb{N}_0$ and $n \ge k$. Then $f$ is a polynomial of degree $t+1$.

Proof. Observe that

\[ f_{(1)}(n) = f(n+1) - f(n) = \]
\[ \bigl(p(k) + p(k+1) + p(k+2) + \cdots + p(n-1) + p(n) + p(n+1)\bigr) - \]
\[ \bigl(p(k) + p(k+1) + p(k+2) + \cdots + p(n-1) + p(n)\bigr) = p(n+1). \]




Since $p(x)$ has degree $t$, it follows that $f_{(1)}(n) = p(n+1)$ also has degree $t$. However, since
the first difference function of $f$ is a polynomial of degree $t$, the same calculations as in
Lemma 13.3 tell us that the $(t+1)$st difference function of $f$ is a nonzero constant.
Corollary 13.7 now asserts that $f$ is indeed a polynomial of degree $t+1$.
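The behavior described in the proof can be observed numerically. The sketch below, assuming the helper names `first_difference` and `partial_sums` (both hypothetical), builds a table of repeated differences for the partial sums of the quadratic $p(x) = (2x-1)^2$ starting at $k = 1$. Since $t = 2$, the third differences should be a nonzero constant and the fourth differences should vanish.

```python
def first_difference(values):
    """Given f(k), f(k+1), ..., return f(k+1)-f(k), f(k+2)-f(k+1), ..."""
    return [b - a for a, b in zip(values, values[1:])]

def partial_sums(p, k, count):
    """Return f(k), f(k+1), ..., where f(n) = p(k) + p(k+1) + ... + p(n)."""
    sums, total = [], 0
    for n in range(k, k + count):
        total += p(n)
        sums.append(total)
    return sums

# p has degree t = 2, so f should behave like a polynomial of degree 3.
values = partial_sums(lambda x: (2 * x - 1) ** 2, k=1, count=8)
table = [values]
for _ in range(4):
    table.append(first_difference(table[-1]))

for level, row in enumerate(table):
    print(level, row)
```

Row 1 of the table lists the values $p(n+1)$, in agreement with the identity $f_{(1)}(n) = p(n+1)$ from the proof.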


We will now work through several examples. 


m Example 
Let us find a formula for the function

\[ f(n) = 5 + 8 + 11 + \cdots + (3n-1) + (3n+2). \]

Since $f$ is obtained by adding values of the linear function $p(x) = 3x + 2$ starting with
$p(1)$, Corollary 13.9 tells us that $f$ is a quadratic. Therefore, there exist $a, b, c \in \mathbb{R}$ such that

\[ f(n) = an^2 + bn + c. \tag{2} \]

As in Proposition 13.8, to find $a, b, c$, we will need three values of $f(n)$. Since we know
that $f(1) = 5$, $f(2) = 5 + 8 = 13$, $f(3) = 5 + 8 + 11 = 24$, we can plug $n = 1, 2, 3$ into (2)
to obtain

\[ 5 = a + b + c \]
\[ 13 = 4a + 2b + c \]
\[ 24 = 9a + 3b + c. \]

We have reduced this problem down to solving a system of three linear equations in three
unknowns. It is now easy to determine that the unique solution is $a = \frac{3}{2}$, $b = \frac{7}{2}$, $c = 0$. As
a result, we have derived the formula

\[ 5 + 8 + 11 + \cdots + (3n-1) + (3n+2) = \frac{3n^2 + 7n}{2}. \]
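The procedure in this example (sample the sum at enough points, then solve the resulting linear system) can be sketched in Python. This is only an illustration under stated assumptions: the names `solve_linear_system` and `fit_sum_polynomial` are hypothetical, the sum is assumed to start at $p(1)$, and Gauss-Jordan elimination over the rationals is one of several ways to solve the system.

```python
from fractions import Fraction

def solve_linear_system(matrix, rhs):
    """Solve a square system by Gauss-Jordan elimination over the rationals.
    Assumes the system has a unique solution."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def fit_sum_polynomial(p, degree):
    """Fit f(n) = p(1) + ... + p(n) with a polynomial of the given degree,
    using the sample points n = 1, ..., degree + 1.
    Returns coefficients from the highest power down to the constant term."""
    ns = range(1, degree + 2)
    matrix = [[n ** e for e in range(degree, -1, -1)] for n in ns]
    rhs = [sum(p(i) for i in range(1, n + 1)) for n in ns]
    return solve_linear_system(matrix, rhs)

a, b, c = fit_sum_polynomial(lambda x: 3 * x + 2, degree=2)
print(a, b, c)  # 3/2 7/2 0
```

The degree passed in must be $t + 1$, as guaranteed by Corollary 13.9; the code does not verify this.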


Hopefully, Corollary 13.9 and the preceding example have helped to take the mystery out of 
some of the formulas that appear when we are learning to use Mathematical Induction. As we 
use Corollary 13.9 to derive various formulas, the last step is always to solve a system of 
linear equations. These systems of equations can sometimes be rather messy and quite difficult 
to solve by hand. However, as we will soon see, by using difference functions, we can always
reduce these problems down to a system of linear equations which is extremely easy to solve. 
In the next example we will derive the appropriate formula in two ways. The second technique 
will illustrate how using difference functions can greatly simplify the computations. 




m Example 


Find a formula for the function

\[ g(n) = 1^2 + 3^2 + 5^2 + \cdots + (2n-3)^2 + (2n-1)^2. \]

Since $g(n)$ is obtained by adding values of the quadratic $p(x) = (2x-1)^2$ starting with
$p(1)$, Corollary 13.9 tells us that $g(n)$ is a cubic. As a result, there exist $a, b, c, d \in \mathbb{R}$ such
that

\[ g(n) = an^3 + bn^2 + cn + d. \tag{3} \]

Using the same technique as in the previous example, by looking at four values of $g(n)$,
we will obtain four linear equations in $a, b, c, d$. Since $g(1) = 1^2 = 1$, $g(2) = 1^2 + 3^2 = 10$,
$g(3) = 1^2 + 3^2 + 5^2 = 35$, and $g(4) = 1^2 + 3^2 + 5^2 + 7^2 = 84$, we can now plug $n = 1, 2,
3, 4$ into (3) to obtain

\[ 1 = a + b + c + d \]
\[ 10 = 8a + 4b + 2c + d \]
\[ 35 = 27a + 9b + 3c + d \]
\[ 84 = 64a + 16b + 4c + d. \]

Solving these four equations requires some work, but eventually you will obtain the
solution $a = \frac{4}{3}$, $b = 0$, $c = -\frac{1}{3}$, $d = 0$. Thus,

\[ 1^2 + 3^2 + 5^2 + \cdots + (2n-3)^2 + (2n-1)^2 = \frac{4}{3}n^3 - \frac{1}{3}n. \]


However, we can use difference functions to produce four linear equations involving
$a, b, c, d$ that are much easier to solve. In light of Corollary 13.9, we know that
$g(n) = an^3 + bn^2 + cn + d$. Taking the first difference function of $g(n)$, we obtain

\[ g_{(1)}(n) = g(n+1) - g(n) = \]
\[ \bigl(a(n+1)^3 + b(n+1)^2 + c(n+1) + d\bigr) - \bigl(an^3 + bn^2 + cn + d\bigr) = \]
\[ a\bigl((n+1)^3 - n^3\bigr) + b\bigl((n+1)^2 - n^2\bigr) + c\bigl((n+1) - n\bigr) = \]
\[ a\bigl((n^3 + 3n^2 + 3n + 1) - n^3\bigr) + b\bigl((n^2 + 2n + 1) - n^2\bigr) + c\bigl((n+1) - n\bigr) = \]
\[ (3a)n^2 + (3a + 2b)n + (a + b + c). \tag{4} \]

On the other hand, as we saw in the proof of Corollary 13.9, $g_{(1)}(n) = p(n+1)$. Since
$p(x) = (2x-1)^2$, we obtain

\[ g_{(1)}(n) = (2(n+1) - 1)^2 = (2n+1)^2 = 4n^2 + 4n + 1. \tag{5} \]




By comparing the coefficients of the various powers of $n$ in equations (4) and (5), we
obtain the linear equations

\[ 3a = 4 \]
\[ 3a + 2b = 4 \]
\[ a + b + c = 1. \]

Observe that this system of linear equations is extremely easy to solve. The first equation
tells us that $a = \frac{4}{3}$. Plugging the value of $a$ into the second equation yields $b = 0$. Then
plugging the values of $a$ and $b$ into the third equation gives us $c = -\frac{1}{3}$. Finally, letting
$n = 1$ in the formula $g(n) = an^3 + bn^2 + cn + d$ results in the equation

\[ a + b + c + d = 1. \]

But having already found $a, b, c$, this equation immediately tells us that $d = 0$. Therefore,
we have once again, this time with much less work, derived the formula

\[ 1^2 + 3^2 + 5^2 + \cdots + (2n-3)^2 + (2n-1)^2 = \frac{4}{3}n^3 - \frac{1}{3}n. \]


As the degrees of the polynomials involved increase, so does the amount of work saved by 
using difference functions. This will become clearer in the next example. 


m Example 
Find a formula for

\[ F(n) = 1^4 + 2^4 + 3^4 + \cdots + n^4. \]

In light of Corollary 13.9, we know that $F$ will be a polynomial of degree 5. Thus, we
need to find $a, b, c, d, e, f \in \mathbb{R}$ such that

\[ F(n) = 1^4 + 2^4 + 3^4 + \cdots + n^4 = an^5 + bn^4 + cn^3 + dn^2 + en + f. \tag{6} \]

Taking the first difference function of $F$, we obtain

\[ F_{(1)}(n) = F(n+1) - F(n) = \]
\[ \bigl(a(n+1)^5 + b(n+1)^4 + c(n+1)^3 + d(n+1)^2 + e(n+1) + f\bigr) - \bigl(an^5 + bn^4 + cn^3 + dn^2 + en + f\bigr) = \]
\[ a\bigl((n+1)^5 - n^5\bigr) + b\bigl((n+1)^4 - n^4\bigr) + c\bigl((n+1)^3 - n^3\bigr) + d\bigl((n+1)^2 - n^2\bigr) + e\bigl((n+1) - n\bigr) = \]
\[ a(5n^4 + 10n^3 + 10n^2 + 5n + 1) + b(4n^3 + 6n^2 + 4n + 1) + c(3n^2 + 3n + 1) + d(2n + 1) + e = \]
\[ (5a)n^4 + (10a + 4b)n^3 + (10a + 6b + 3c)n^2 + (5a + 4b + 3c + 2d)n + (a + b + c + d + e). \tag{7} \]




On the other hand, since $F_{(1)}(n) = p(n+1)$ and $p(x) = x^4$, we obtain

\[ F_{(1)}(n) = (n+1)^4 = n^4 + 4n^3 + 6n^2 + 4n + 1. \tag{8} \]

By comparing the coefficients of the various powers of $n$ in equations (7) and (8), we
obtain the linear equations

\[ 5a = 1 \]
\[ 10a + 4b = 4 \]
\[ 10a + 6b + 3c = 6 \]
\[ 5a + 4b + 3c + 2d = 4 \]
\[ a + b + c + d + e = 1. \]

Going in order from the top linear equation to the bottom, we can now easily determine,
in sequence, the values of $a, b, c, d, e$. It follows that $a = \frac{1}{5}$, $b = \frac{1}{2}$, $c = \frac{1}{3}$, $d = 0$, and
$e = -\frac{1}{30}$. Finally, by letting $n = 1$ in equation (6) we see that

\[ a + b + c + d + e + f = 1. \]

Having already found the values of $a, b, c, d, e$, this equation immediately tells us that
$f = 0$. Thus,

\[ 1^4 + 2^4 + 3^4 + \cdots + n^4 = \frac{1}{5}n^5 + \frac{1}{2}n^4 + \frac{1}{3}n^3 - \frac{1}{30}n. \]


On the other hand, if we attempted to solve this problem by letting $n = 1, 2, 3, 4, 5, 6$ in
equation (6), then we would obtain the following system of linear equations:

\[ a + b + c + d + e + f = 1 \]
\[ 32a + 16b + 8c + 4d + 2e + f = 17 \]
\[ 243a + 81b + 27c + 9d + 3e + f = 98 \]
\[ 1024a + 256b + 64c + 16d + 4e + f = 354 \]
\[ 3125a + 625b + 125c + 25d + 5e + f = 979 \]
\[ 7776a + 1296b + 216c + 36d + 6e + f = 2275. \]

Certainly, this system of linear equations is much more complicated than the one obtained
using difference functions. Thus, Corollary 13.9 and difference functions do provide us
with a fairly easy way of deriving many important formulas.
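The difference-function technique of the last two examples amounts to solving a triangular system from the top down, which is easy to automate. The following Python sketch is one possible arrangement, assuming the sum starts at $p(1)$; the function name `sum_formula` and the convention of listing coefficients from the constant term upward are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def sum_formula(p):
    """p lists the coefficients of p(x) from the constant term up, so p[i]
    multiplies x**i.  Returns the coefficients c (same ordering) of the
    polynomial F with F(n) = p(1) + p(2) + ... + p(n)."""
    t = len(p) - 1
    # Coefficients of p(n + 1), expanded with the binomial theorem.
    q = [sum(Fraction(p[i]) * comb(i, j) for i in range(j, t + 1))
         for j in range(t + 1)]
    c = [Fraction(0)] * (t + 2)
    # Compare the coefficients of n**j in F(n+1) - F(n) and in p(n+1),
    # from the highest power down; each equation has one new unknown.
    for j in range(t, -1, -1):
        known = sum(c[m] * comb(m, j) for m in range(j + 2, t + 2))
        c[j + 1] = (q[j] - known) / (j + 1)
    # The constant term comes from F(1) = p(1).
    c[0] = sum(Fraction(v) for v in p) - sum(c[1:])
    return c

fourth_powers = sum_formula([0, 0, 0, 0, 1])   # p(x) = x^4
odd_squares = sum_formula([1, -4, 4])          # p(x) = (2x - 1)^2
print(fourth_powers)
print(odd_squares)
```

Each pass through the loop corresponds to one of the linear equations obtained by comparing coefficients, and the final line corresponds to letting $n = 1$.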




Exercises for Sections 13.1 and 13.2 


1. (a) Use difference functions to determine the smallest degree of a polynomial that can
       produce the following table.
   (b) Then find the polynomial of smallest possible degree that does produce the
       following table.

       x:     -6   -5   -4   -3   -2   -1    0    1    2
       F(x):  83   72   61   50   39   28   17    6   -5

2. (a) Use difference functions to determine the smallest degree of a polynomial that can
       produce the following table.
   (b) Then find the polynomial of smallest possible degree that does produce the
       following table.

       x:     -22    -14    -6     2      10     18     26     34     42
       g(x):  -26.5  -24.5  -22.5  -20.5  -18.5  -16.5  -14.5  -12.5  -10.5

3. (a) Use difference functions to determine the smallest degree of a polynomial that can
       produce the following table.
   (b) Then find the polynomial of smallest possible degree that does produce the
       following table.

       x:      3    4    5    6    7    8    9   10   11
       h(x):  24   24   22   18   12    4   -6  -18  -32

4. (a) Use difference functions to determine the smallest degree of a polynomial that can
       produce the following table.
   (b) Then find the polynomial of smallest possible degree that does produce the
       following table.

       x:      -8   -5   -2    1    4    7   10   13   16
       p(x):  289  163   73   19    1   19   73  163  289

5. (a) Use difference functions to determine the smallest degree of a polynomial that can
       produce the following table.
   (b) Then find the polynomial of smallest possible degree that does produce the
       following table.

       x:     -10   -6   -2    2    6   10   14   18   22
       q(x):  293  129   29   -7   21  113  269  489  773

6. (a) Use difference functions to determine the smallest degree of a polynomial that can
       produce the following table.
   (b) Then find the polynomial of smallest possible degree that does produce the
       following table.

       x:       -23   -18   -13    -8    -3     2     7    12    17
       f(x):  -1461  -871  -431  -141    -1   -11  -171  -481  -941

7. (a) Use difference functions to determine the smallest degree of a polynomial that can
       produce the following table.
   (b) Then find the polynomial of smallest possible degree that does produce the
       following table.

       x:      -2   -1    0    1    2    3    4    5    6
       g(x):  -19   -2    1    2   13   46  113  226  397

8. (a) Use difference functions to determine the smallest degree of a polynomial that can
       produce the following table.
   (b) Then find the polynomial of smallest possible degree that does produce the
       following table.

       x:       -19    -14    -9    -4     1     6    11    16    21
       h(x):  -6944  -2809  -774   -89    -4   231  1366  4151  9336

9. (a) Use difference functions to determine the smallest degree of a polynomial that can
       produce the following table.
   (b) Then find the polynomial of smallest possible degree that does produce the
       following table.

       x:        -8    -5    -2     1     4     7    10    13    16
       p(x):  -1205  -314   -17    10    91   550  1711  3898  7435

10. (a) Use difference functions to determine the smallest degree of a polynomial that can
        produce the following table.
    (b) Then find the polynomial of smallest possible degree that does produce the
        following table.

        x:      -7   -5   -3   -1    1    3    5     7     9
        q(x):  362  154   66   50   58   42  -46  -254  -630

11. If $F(n) = 1 + 5 + 9 + \cdots + (4n-3)$, write $F(n)$ as a polynomial.

12. If $G(n) = 8 + 13 + 18 + \cdots + (5n-7)$, write $G(n)$ as a polynomial.

13. If $F(n) = 1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + n(n+1)$, write $F(n)$ as a polynomial.

14. If $G(n) = 1 \cdot 2 + 3 \cdot 4 + 5 \cdot 6 + \cdots + (2n-1)(2n)$, write $G(n)$ as a polynomial.

15. If $W(n) = 1 \cdot 1 + 2 \cdot 3 + 3 \cdot 5 + \cdots + (n)(2n-1)$, write $W(n)$ as a polynomial.


506 Chapter 13 


16. If $T(n) = 1 \cdot 2 \cdot 3 + 2 \cdot 3 \cdot 4 + 3 \cdot 4 \cdot 5 + \cdots + (n)(n+1)(n+2)$, write $T(n)$ as a polynomial.

17. If $S(n) = 1 \cdot 2 \cdot 3 + 3 \cdot 4 \cdot 5 + 5 \cdot 6 \cdot 7 + \cdots + (2n-1)(2n)(2n+1)$, write $S(n)$ as a
    polynomial.

18. If $H(n) = 1 \cdot 2 \cdot 3 + 4 \cdot 5 \cdot 6 + 7 \cdot 8 \cdot 9 + \cdots + (3n-2)(3n-1)(3n)$, write $H(n)$ as a
    polynomial.

19. If F(n) = ys +4i), write F(n) as a polynomial.

20. If Gn) = or — 2i), write G(n) as a polynomial.

21. If H(n) = Lae + 5i), write H(n) as a polynomial.

22. If Sa) = pean (2i* —7i+5), write S(n) as a polynomial.

23. If $D(n) = 1^4 + 2^4 + 3^4 + \cdots + n^4$, write $D(n)$ as a polynomial.

24. If $T(n) = 1^5 + 2^5 + 3^5 + \cdots + n^5$, write $T(n)$ as a polynomial.

25. If Wn) = 1° 4+ 24 3°+.---+n°, write W(n) as a polynomial.

26. On an 8 x 8 checkerboard, there are 64 squares each with dimension 1 x 1. The
    checkerboard also contains 2 x 2 squares, 3 x 3 squares, all the way up to an 8 x 8
    square. What is the total number of squares on the checkerboard?

27. Repeat exercise 26, but instead suppose you had a 10 x 10 checkerboard. In this case,
    what is the total number of squares on the checkerboard?

28. We will now generalize exercises 26 and 27. If you have an n x n checkerboard, what is
    the total number of squares on the checkerboard?


In exercises 29-38, we will examine additional similarities between difference functions and 
derivatives. Throughout these exercises, we will make frequent use of the definition of 
difference functions that appeared in Definition 13.1. 


29. If $f_{(n)}(x) = 0$, where $n \in \mathbb{N}$, show that there exist $\alpha_0, \alpha_1, \ldots, \alpha_{n-1} \in \mathbb{R}$ such that
    $f(x) = \alpha_{n-1}x^{n-1} + \cdots + \alpha_1 x + \alpha_0$.

30. Show that if $f(x), g(x)$ are functions such that $f_{(n)}(x) = g_{(n)}(x)$, where $n \in \mathbb{N}$, then there
    exist $\alpha_0, \alpha_1, \ldots, \alpha_{n-1} \in \mathbb{R}$ such that $f(x) = g(x) + \alpha_{n-1}x^{n-1} + \cdots + \alpha_1 x + \alpha_0$.

31. Let $f(x) = \alpha a^x$, where $\alpha, a \in \mathbb{R}$, $\alpha \ne 0$, $a > 0$, and $a \ne 1$.
    (a) Show that $f_{(1)}(x) = \alpha(a-1)a^x$.
    (b) Show that, for all $n \in \mathbb{N}$, $f_{(n)}(x) = \alpha(a-1)^n a^x$.

32. Use exercise 31 to find the value of $a \in \mathbb{R}$ such that $f(x) = \alpha a^x$ has the property that
    $f_{(1)}(x) = f(x)$.

33. Let $f(x)$ be a function such that $f_{(1)}(x) = f(x)$ and let $g(x) = \frac{f(x)}{2^x}$.
    (a) Show that $g_{(1)}(x) = 0$.
    (b) Use Lemma 13.4 or exercise 29 to conclude that $f(x) = \alpha 2^x$, for some $\alpha \in \mathbb{R}$.

34. (a) Show that if $f(x) = \sin(x)$, then $f_{(1)}(x) = \alpha\sin(x) + \beta\cos(x)$, for some $\alpha, \beta \in \mathbb{R}$
        and determine the exact values of $\alpha, \beta$.
    (b) Show that if $g(x) = \cos(x)$, then $g_{(1)}(x) = \gamma\sin(x) + \delta\cos(x)$, for some $\gamma, \delta \in \mathbb{R}$ and
        determine the exact values of $\gamma, \delta$.
    (c) If $A, B \in \mathbb{R}$, $n \in \mathbb{N}$, and $F(x) = A\sin(x) + B\cos(x)$, show that there exist $a, b \in \mathbb{R}$
        such that $F_{(n)}(x) = a\sin(x) + b\cos(x)$.


In exercises 35-38, you will need to find functions that produce a given difference function. 
This is analogous to finding antiderivatives in calculus. 


35. (a) Use exercises 30 and 31 to find all functions $f(x)$ such that $f_{(1)}(x) = 10^x$.
    (b) Use part (a) to find the unique function $f(x)$ such that $f_{(1)}(x) = 10^x$ and $f(0) = 3$.

36. (a) Use exercises 30 and 31 to find all functions $g(x)$ such that $g_{(1)}(x) = e^x$.
    (b) Use part (a) to find the unique function $g(x)$ such that $g_{(1)}(x) = e^x$ and $g(0) = 7$.

37. (a) Find all functions $f(x)$ such that $f_{(2)}(x) = 3^x$.
    (b) Find the unique function $f(x)$ such that $f_{(2)}(x) = 3^x$, $f(0) = 10$, and $f_{(1)}(0) = 21$.

38. (a) Find all functions $g(x)$ such that $g_{(2)}(x) = 15^x$.
    (b) Find the unique function $g(x)$ such that $g_{(2)}(x) = 15^x$, $g(0) = 19$, and $g_{(1)}(0) = 51$.


In exercises 39-42, we will make reference to the functions G,,(x) that were defined in 
Definition 13.2. 


39. (a) Find $\alpha, \beta \in \mathbb{R}$ such that $\alpha G_2(x) + \beta G_1(x) = 6x - 10$.
    (b) Use part (a) to find all functions $f(x)$ such that $f_{(1)}(x) = 6x - 10$.
    (c) Find the unique function $f(x)$ such that $f_{(1)}(x) = 6x - 10$ and $f(0) = 28$.

40. (a) Find $\alpha, \beta \in \mathbb{R}$ such that $\alpha G_2(x) + \beta G_1(x) = 9x - 16$.
    (b) Use part (a) to find all functions $f(x)$ such that $f_{(1)}(x) = 9x - 16$.
    (c) Find the unique function $f(x)$ such that $f_{(1)}(x) = 9x - 16$ and $f(0) = -11$.

41. (a) Find $\alpha, \beta, \gamma \in \mathbb{R}$ such that $\alpha G_3(x) + \beta G_2(x) + \gamma G_1(x) = x^2$.
    (b) Use part (a) to find all functions $f(x)$ such that $f_{(1)}(x) = x^2$.
    (c) Find the unique function $f(x)$ such that $f_{(1)}(x) = x^2$ and $f(0) = 1987$.

42. (a) Find $\alpha, \beta, \gamma \in \mathbb{R}$ such that $\alpha G_3(x) + \beta G_2(x) + \gamma G_1(x) = 2x^2 - 9x + 7$.
    (b) Use part (a) to find all functions $f(x)$ such that $f_{(1)}(x) = 2x^2 - 9x + 7$.
    (c) Find the unique function $f(x)$ such that $f_{(1)}(x) = 2x^2 - 9x + 7$ and $f(0) = -44$.


Some of our previous exercises dealt with finding formulas for the sums of values of 
polynomials. In exercises 43 and 44, we will use difference functions to study sums of values 
of functions involving factorials and exponentials.

43. Let $S(n) = 1 \cdot 1! + 2 \cdot 2! + 3 \cdot 3! + \cdots + n \cdot n!$.
    (a) Find the formula for $S_{(1)}(n)$.
    (b) You might suspect that the formula for $S(n)$ involves factorials that are larger than
        $n!$. In light of this, you might conjecture that the formula for $S(n)$ involves $(n+1)!$.
        Therefore, in an attempt to find a formula for $S(n)$, let $T(n) = (n+1)!$ and find the
        formula for $T_{(1)}(n)$.
    (c) Compare the formulas for $S_{(1)}(n)$ and $T_{(1)}(n)$.
    (d) Use part (c) along with exercise 30 and the fact that $T(1) = 2$ to find the formula for
        $S(n)$.

44. (a) Let $F(n) = 1 + a + a^2 + \cdots + a^n$, where $a \in \mathbb{R}$ and $a \ne 1$. Find $F_{(1)}(n)$.
    (b) You might suspect that the formula for $F(n)$ involves $a^{n+1}$. Therefore, let
        $G(n) = a^{n+1}$ and compare $F_{(1)}(n)$ and $G_{(1)}(n)$.
    (c) Use part (b) to show that $F(n) = \frac{a^{n+1}}{a-1} + \alpha$, for some $\alpha \in \mathbb{R}$.
    (d) Use the fact that $F(0) = 1$ to find $\alpha$ from part (c).
    (e) Write down the formula you have derived for $F(n)$ and then compare it to the
        formula $1 + a + a^2 + \cdots + a^n = \frac{a^{n+1} - 1}{a - 1}$ you have likely seen in precalculus and
        calculus courses.


In exercises 45-54, we will let

\[ F(n) = p(0) + p(1)a + p(2)a^2 + \cdots + p(n)a^n, \]

where $n \ge 0$, $a \in \mathbb{R}$, $a \ne 1$, and $p$ is a polynomial of degree $t$. We will use difference functions
to find a formula for $F(n)$. To do this, for every $n \in \mathbb{N}_0$, let $H_n(x) = a(x+1)^n - x^n$. Observe
that the functions $H_n(x)$ are similar to the functions $G_n(x)$ defined in Definition 13.2.

45. Show that $H_n(x)$ has degree $n$, for all $n \in \mathbb{N}_0$.

46. Show that for every polynomial $f(x) \in \mathbb{R}[x]$ of degree $n$, there exist $\alpha_0, \alpha_1, \ldots, \alpha_n$ such
    that $f(x) = \alpha_0 H_0(x) + \alpha_1 H_1(x) + \cdots + \alpha_n H_n(x)$. (Hint: Look at the proof of Lemma
    13.5.)

47. Use exercise 46 to show that for every polynomial $f(x) \in \mathbb{R}[x]$ of degree $n$, there exists
    $h(x) \in \mathbb{R}[x]$ of degree $n$ such that $f(x) = ah(x+1) - h(x)$.

48. Show that $F_{(1)}(n) = p(n+1)a^{n+1}$.

49. In light of exercise 48, one might guess that $F(n)$ is of the form $g(n)a^{n+1}$, for some
    polynomial $g$. Therefore, let $T(n) = g(n)a^{n+1}$, where $g$ is a polynomial, and then
    compute $T_{(1)}(n)$.

50. Use exercise 47 to show that there exists some $h(x) \in \mathbb{R}[x]$ such that $T(n) = h(n)a^{n+1}$
    has the property that $T_{(1)}(n) = F_{(1)}(n)$.

51. Use exercises 30 and 47 to show that there exists a polynomial $h(x)$ of degree $t$ and some
    $\alpha \in \mathbb{R}$ such that $p(0) + p(1)a + p(2)a^2 + \cdots + p(n)a^n = h(n)a^{n+1} + \alpha$.

52. Use exercise 51 to find a formula for the sum $0 \cdot 2^0 + 1 \cdot 2^1 + 2 \cdot 2^2 + \cdots + n \cdot 2^n$. Observe
    that by letting $n = 0, 1, 2$, you can generate three linear equations that will enable you
    to find the coefficients of $h$ and the value of $\alpha$ in the previous exercise.

53. Use exercise 51 to find a formula for the sum $0 \cdot 5^0 + 1 \cdot 5^1 + 2 \cdot 5^2 + \cdots + n \cdot 5^n$.

54. Generalize the formulas you found in the previous two exercises by finding the formula
    for the sum $a + 2a^2 + 3a^3 + \cdots + na^n$. Your final answer should be in terms of $a$ and $n$.


In exercises 55-58, we will determine all $p(x) \in \mathbb{C}[x]$ that have the following properties:
(i) $p(\alpha) \in \mathbb{R}$, for all $\alpha \in \mathbb{R}$, and
(ii) $p(\beta) \notin \mathbb{R}$, for all $\beta \in \mathbb{C}$ such that $\beta \notin \mathbb{R}$.

55. Let $F \subseteq L$ be fields, where both $F$ and $L$ are infinite sets. Suppose $p(x) \in L[x]$ has the
    property that $p(\alpha) \in F$, for all $\alpha \in F$. Prove that all the coefficients of the polynomial
    $p(x)$ must belong to the smaller field $F$. (Hint: Think about the ideas behind
    Proposition 13.8 and its proof.)

56. (a) Let $p(x) \in \mathbb{R}[x]$ have a positive leading coefficient. Show that there exists a
        real number $T$ such that the polynomial $p(x) + T$ has at most one real root and
        no multiple roots in $\mathbb{R}$. (Hint: It might help to first look at this problem
        graphically.)
    (b) Use part (a) to show that every $p(x) \in \mathbb{R}[x]$ has the property that there exists a real
        number $T$ such that $p(x) + T$ has at most one real root and no multiple roots in $\mathbb{R}$.

57. Let $p(x) \in \mathbb{R}[x]$ have degree at least two. Use exercise 56 to show that there exists some
    $\alpha \in \mathbb{C}$ such that $\alpha \notin \mathbb{R}$ and $p(\alpha) \in \mathbb{R}$.

58. Use exercises 55 and 57 to find all $p(x) \in \mathbb{C}[x]$ that have the properties described before
    exercise 55.


13.3 Partial Fraction Decomposition 


We now turn our attention to the second main topic of this chapter: the partial fraction 
decomposition of rational functions. When studying techniques of integration in calculus, a 
typical problem we might run across is 


\[ \int \frac{4x^4 - 17x^3 - 6x^2 - 26x - 17}{x^3 - 5x^2 + x - 5}\,dx. \]

In an attempt to find a function whose derivative is $\frac{4x^4 - 17x^3 - 6x^2 - 26x - 17}{x^3 - 5x^2 + x - 5}$, we would proceed as
follows:

Step I   Use the division algorithm to divide $4x^4 - 17x^3 - 6x^2 - 26x - 17$ by
         $x^3 - 5x^2 + x - 5$ to obtain a quotient of $4x + 3$ and a remainder of $5x^2 - 9x - 2$. We
         can divide the equation

\[ 4x^4 - 17x^3 - 6x^2 - 26x - 17 = (4x + 3)(x^3 - 5x^2 + x - 5) + (5x^2 - 9x - 2) \]

         by $x^3 - 5x^2 + x - 5$ to obtain

\[ \frac{4x^4 - 17x^3 - 6x^2 - 26x - 17}{x^3 - 5x^2 + x - 5} = (4x + 3) + \frac{5x^2 - 9x - 2}{x^3 - 5x^2 + x - 5}. \]

         Since $4x + 3$ is easy to integrate, the problem now reduces to integrating
         $\frac{5x^2 - 9x - 2}{x^3 - 5x^2 + x - 5}$. Observe that whereas the degree of $4x^4 - 17x^3 - 6x^2 - 26x - 17$
         exceeds the degree of the denominator, the degree of $5x^2 - 9x - 2$ is less than the
         degree of the denominator. This fact will be essential for performing the partial
         fraction decomposition.
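The long division in Step I can be reproduced with a short routine. This is a sketch only: the function name `poly_divmod` and the representation of a polynomial as a list of coefficients, highest power first, are assumptions made for illustration.

```python
from fractions import Fraction

def poly_divmod(numerator, denominator):
    """Divide polynomials given as coefficient lists, highest power first.
    Returns (quotient, remainder), also highest power first.  The remainder
    list may begin with zeros if its true degree is smaller."""
    remainder = [Fraction(c) for c in numerator]
    quotient = []
    lead = Fraction(denominator[0])
    # Each pass cancels the current leading term of the remainder.
    while len(remainder) >= len(denominator):
        factor = remainder[0] / lead
        quotient.append(factor)
        for i, d in enumerate(denominator):
            remainder[i] -= factor * d
        remainder.pop(0)  # the leading coefficient is now zero
    return quotient, remainder

q, r = poly_divmod([4, -17, -6, -26, -17], [1, -5, 1, -5])
print(q, r)  # quotient 4x + 3, remainder 5x^2 - 9x - 2
```

The loop stops as soon as the remainder has fewer coefficients than the divisor, which is the degree condition that Step I relies on.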


Step II  Completely factor the denominator of $\frac{5x^2 - 9x - 2}{x^3 - 5x^2 + x - 5}$ into irreducible polynomials in
         $\mathbb{R}[x]$. Recall that the Fundamental Theorem of Algebra tells us that the only
         irreducible polynomials in $\mathbb{R}[x]$ are all linear polynomials and those quadratic
         polynomials that have no real roots. We can then factor $x^3 - 5x^2 + x - 5$ as
         $(x - 5)(x^2 + 1)$.

Step III Write $\frac{5x^2 - 9x - 2}{x^3 - 5x^2 + x - 5}$ as a sum of partial fractions using the denominators obtained in
         Step II. Therefore, we need to find real numbers $a, b, c$ such that

\[ \frac{5x^2 - 9x - 2}{x^3 - 5x^2 + x - 5} = \frac{a}{x - 5} + \frac{bx + c}{x^2 + 1}. \]

         Next, we multiply both sides of this equation by $x^3 - 5x^2 + x - 5 = (x - 5)(x^2 + 1)$
         and obtain

\[ 5x^2 - 9x - 2 = a(x^2 + 1) + (bx + c)(x - 5) = (ax^2 + a) + (bx^2 + (-5b + c)x - 5c) = \]
\[ (a + b)x^2 + (-5b + c)x + (a - 5c). \tag{9} \]

         The only way the polynomials in (9) can be equal is if the coefficients of $x^2$,
         $x^1 = x$, and $x^0 = 1$ are all equal. By setting the corresponding coefficients in (9)
         equal to each other, we obtain the following system of linear equations:

\[ a + b = 5 \]
\[ -5b + c = -9 \]
\[ a - 5c = -2. \]

         The solution to this system of equations is $a = 3$, $b = 2$, $c = 1$. Thus

\[ \frac{5x^2 - 9x - 2}{x^3 - 5x^2 + x - 5} = \frac{3}{x - 5} + \frac{2x + 1}{x^2 + 1}. \]

Step IV  Having decomposed $\frac{4x^4 - 17x^3 - 6x^2 - 26x - 17}{x^3 - 5x^2 + x - 5}$ as

\[ \frac{4x^4 - 17x^3 - 6x^2 - 26x - 17}{x^3 - 5x^2 + x - 5} = (4x + 3) + \frac{5x^2 - 9x - 2}{x^3 - 5x^2 + x - 5} = (4x + 3) + \frac{3}{x - 5} + \frac{2x + 1}{x^2 + 1}, \]

         we now have

\[ \int \frac{4x^4 - 17x^3 - 6x^2 - 26x - 17}{x^3 - 5x^2 + x - 5}\,dx = \int (4x + 3)\,dx + \int \frac{3}{x - 5}\,dx + \int \frac{2x + 1}{x^2 + 1}\,dx = \]
\[ 2x^2 + 3x + 3\ln|x - 5| + \ln|x^2 + 1| + \arctan(x) + C. \]
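The result of Steps I through IV can be sanity-checked by machine. The sketch below is a verification aid rather than part of the method: it compares the original rational function with its decomposition exactly at rational sample points, and then checks numerically (by a central difference, an assumption of this sketch) that the antiderivative differentiates back to the integrand. The function names are hypothetical.

```python
from fractions import Fraction
import math

def original(x):
    return (4*x**4 - 17*x**3 - 6*x**2 - 26*x - 17) / (x**3 - 5*x**2 + x - 5)

def decomposed(x):
    return (4*x + 3) + 3 / (x - 5) + (2*x + 1) / (x**2 + 1)

def antiderivative(x):
    return (2*x**2 + 3*x + 3*math.log(abs(x - 5))
            + math.log(x**2 + 1) + math.atan(x))

# Exact check at rational sample points (none of them is the pole x = 5).
samples = [Fraction(n, 3) for n in range(-9, 10)]
assert all(original(x) == decomposed(x) for x in samples)

# Numerical check that the antiderivative differentiates to the integrand.
h = 1e-6
for x in (-2.0, 0.5, 3.0, 7.0):
    slope = (antiderivative(x + h) - antiderivative(x - h)) / (2 * h)
    assert math.isclose(slope, original(x), rel_tol=1e-6, abs_tol=1e-6)
```

Agreement at finitely many points does not by itself prove the identity, but two rational functions of this size that agree at nineteen points must coincide, so the exact check is conclusive here.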




Recall that there are two types of partial fractions. The simpler type is of the form $\frac{\alpha}{(mx + b)^n}$,
where $\alpha, m, b \in \mathbb{R}$, $n \in \mathbb{N}$, and both $\alpha$ and $m$ are nonzero. Examples of this type of partial
fraction are


7 VJ 35 a 
, , , and ———_. 
x—-2° x3" 4x-3 (2x+7)8 


The second type of partial fraction is of the form $\frac{\alpha x + \beta}{(ax^2 + bx + c)^n}$, where $\alpha, \beta, a, b, c \in \mathbb{R}$, $n \in \mathbb{N}$,
at least one of $\alpha, \beta$ is nonzero, and the quadratic $ax^2 + bx + c$ has no real roots. Examples of
this type of partial fraction are


3x—8 af? llx+76 xr—7 
’ ; > an ‘ 
x24+4’ (2x741)4?) x?4+1 (3x2 + 7x +55)? 


Observe that whereas $\frac{3x - 8}{x^2 + 4}$ is a partial fraction, $\frac{3x - 8}{x^2 - 4}$ is not a partial fraction as the quadratic in
the denominator does have real roots.

The partial fraction decomposition of rational functions asserts that every rational function 
can be written as the sum of a polynomial and partial fractions. When we look back at the 
previous example, we see that Step IV is entirely based on calculus and takes place after
having already decomposed $\frac{4x^4 - 17x^3 - 6x^2 - 26x - 17}{x^3 - 5x^2 + x - 5}$.

The actual decomposition of $\frac{4x^4 - 17x^3 - 6x^2 - 26x - 17}{x^3 - 5x^2 + x - 5}$ takes place in Steps I, II, and III. Observe


that Step I is a straightforward application of the division algorithm for polynomials. Step II is 
the version of the Fundamental Theorem of Algebra, which states that all irreducible 
polynomials in R[x] must have degree 1 or 2. Thus, the real issue is why does Step III work? 
More precisely, why does setting things up the way we do in Step III always result in a system 
of linear equations that has a solution? 


It turns out that the answer lies in properties of polynomials that can be derived from the 
division algorithm. In order to prove this, we will once again examine and exploit the 
similarities between the positive integers and polynomials with coefficients in a field. We 
begin with a lemma about rational numbers which follows from the Euclidean Algorithm. 


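Lemma 13.10. Let $\frac{a}{b}$ be a nonzero rational number, where $b = d_1 d_2 \cdots d_n$, such that each $d_i$
is a positive integer with $d_i$ and $d_j$ relatively prime whenever $i \ne j$. Then

\[ \frac{a}{b} = A + \frac{c_1}{d_1} + \frac{c_2}{d_2} + \cdots + \frac{c_n}{d_n}, \]

where $A, c_1, c_2, \ldots, c_n \in \mathbb{Z}$ and $0 \le c_i < d_i$, for all $i$.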




Intuition. In some sense, Lemma 13.10 undoes the addition and subtraction of fractions.
When we add and subtract fractions, we first obtain a common denominator as in

\[ \frac{4}{7} + \frac{5}{9} = \frac{36 + 35}{63} = \frac{71}{63}. \]

However, Lemma 13.10 asserts that we can do the opposite. For example, it says that $\frac{451}{63}$ can
be decomposed as

\[ \frac{451}{63} = A + \frac{c_1}{7} + \frac{c_2}{9}, \]

where $A, c_1, c_2 \in \mathbb{Z}$, $0 \le c_1 < 7$, and $0 \le c_2 < 9$.

Let's see how to perform this decomposition. Since 7 and 9 are relatively prime, the Euclidean
Algorithm tells us that there exist $\alpha_1, \alpha_2 \in \mathbb{Z}$ such that

\[ 1 = 9\alpha_1 + 7\alpha_2. \]

There are many possible choices for $\alpha_1, \alpha_2$. One such choice is $\alpha_1 = -3$ and $\alpha_2 = 4$, which
gives us

\[ 1 = 9(-3) + 7(4). \]

Multiplying both sides by 451 yields

\[ 451 = 9(-1353) + 7(1804). \]

Then dividing both sides by 63 gives us

\[ \frac{451}{63} = \frac{-1353}{7} + \frac{1804}{9}. \]

Observe that neither $-1353$ nor $1804$ lies within the proper range. However, the division
algorithm tells us that

\[ -1353 = (-194) \cdot 7 + 5 \quad \text{and} \quad 1804 = (200) \cdot 9 + 4. \]

Therefore we can replace $\frac{-1353}{7}$ by $-194 + \frac{5}{7}$ and also replace $\frac{1804}{9}$ by $200 + \frac{4}{9}$. As a result, we
now have

\[ \frac{451}{63} = \frac{-1353}{7} + \frac{1804}{9} = \left(-194 + \frac{5}{7}\right) + \left(200 + \frac{4}{9}\right) = (-194 + 200) + \frac{5}{7} + \frac{4}{9} = 6 + \frac{5}{7} + \frac{4}{9}. \]

Thus, $A = 6$, $c_1 = 5$, and $c_2 = 4$ yields the desired decomposition. In this example, 63 was
factored into the product of two relatively prime numbers. More generally, if the number of
relatively prime factors exceeds two, we can prove that an appropriate decomposition exists
by applying Mathematical Induction to the number of factors of the denominator.


We can now prove Lemma 13.10. 


Proof. We will proceed by Mathematical Induction and begin by letting 


$T = \{n \in \mathbb{N} \mid$ a decomposition exists whenever the denominator
is a product of $n$ factors that are relatively prime to each other$\}$.

We must first show that $1 \in T$. Therefore, we will examine the rational number $\frac{a}{b}$, where
$b = d_1$. If we use the division algorithm to divide $a$ by $d_1$ and then let $A$ denote the quotient and $c_1$
the remainder, we obtain

\[ a = A \cdot d_1 + c_1, \]

where $A, c_1 \in \mathbb{Z}$ and $0 \le c_1 < d_1$. Dividing this equation by $b = d_1$ results in

\[ \frac{a}{b} = A + \frac{c_1}{d_1}. \]

Thus, we have decomposed $\frac{a}{b}$ and so, $1 \in T$.

To conclude the proof, it now suffices to show that if a natural number $k$ belongs to $T$, then
so does $k + 1$. We need to examine the rational number $\frac{a}{b}$, where $b = d_1 d_2 \cdots d_k d_{k+1}$, such that
$d_i$ and $d_j$ are relatively prime whenever $i \ne j$. Let $e = d_1 d_2 \cdots d_k$; observe that $e$ and $d_{k+1}$ are
relatively prime. Therefore, the Euclidean Algorithm implies that there exist $\alpha_1, \alpha_2 \in \mathbb{Z}$
such that

\[ 1 = \alpha_1 \cdot d_{k+1} + \alpha_2 \cdot e. \]

If we multiply this equation by $a$ and let $\beta_1 = a\alpha_1$ and $\beta_2 = a\alpha_2$, we obtain

\[ a = (a\alpha_1) \cdot d_{k+1} + (a\alpha_2) \cdot e = \beta_1 \cdot d_{k+1} + \beta_2 \cdot e. \]

Since $b = e \cdot d_{k+1}$, dividing this equation by $b$ yields

\[ \frac{a}{b} = \frac{\beta_1}{e} + \frac{\beta_2}{d_{k+1}}. \tag{10} \]

Using the fact that $k \in T$, we can decompose $\frac{\beta_1}{e}$ as

\[ \frac{\beta_1}{e} = A_1 + \frac{c_1}{d_1} + \frac{c_2}{d_2} + \cdots + \frac{c_k}{d_k}, \tag{11} \]

where $A_1, c_1, c_2, \ldots, c_k \in \mathbb{Z}$ and $0 \le c_i < d_i$, for all $i$. Furthermore, since $1 \in T$, we can rewrite
$\frac{\beta_2}{d_{k+1}}$ as

\[ \frac{\beta_2}{d_{k+1}} = A_2 + \frac{c_{k+1}}{d_{k+1}}, \]

where $A_2, c_{k+1} \in \mathbb{Z}$ and $0 \le c_{k+1} < d_{k+1}$. If we substitute both equation (11) and the previous
equation back into equation (10) and then let $A = A_1 + A_2$, we obtain

\[ \frac{a}{b} = \frac{\beta_1}{e} + \frac{\beta_2}{d_{k+1}} = \left(A_1 + \frac{c_1}{d_1} + \frac{c_2}{d_2} + \cdots + \frac{c_k}{d_k}\right) + \left(A_2 + \frac{c_{k+1}}{d_{k+1}}\right) = \]
\[ (A_1 + A_2) + \frac{c_1}{d_1} + \frac{c_2}{d_2} + \cdots + \frac{c_k}{d_k} + \frac{c_{k+1}}{d_{k+1}} = A + \frac{c_1}{d_1} + \frac{c_2}{d_2} + \cdots + \frac{c_k}{d_k} + \frac{c_{k+1}}{d_{k+1}}. \]

Since $A, c_1, c_2, \ldots, c_{k+1} \in \mathbb{Z}$ and $0 \le c_i < d_i$, for all $i$, we have successfully decomposed $\frac{a}{b}$.
Thus, $k + 1 \in T$, thereby concluding the proof.
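The induction in this proof translates directly into a recursive procedure. The Python sketch below mirrors it under the lemma's hypotheses (positive, pairwise relatively prime denominators); the function names `extended_gcd` and `decompose` are hypothetical, and the extended form of the Euclidean Algorithm is used to produce $\alpha_1, \alpha_2$.

```python
from fractions import Fraction
from math import prod

def extended_gcd(m, n):
    """Return (g, s, t) with g = gcd(m, n) and s*m + t*n == g."""
    if n == 0:
        return m, 1, 0
    g, s, t = extended_gcd(n, m % n)
    return g, t, s - (m // n) * t

def decompose(a, denominators):
    """Return (A, [c_1, ..., c_n]) with
    a / (d_1 * ... * d_n) == A + c_1/d_1 + ... + c_n/d_n and 0 <= c_i < d_i.
    Assumes the d_i are positive and pairwise relatively prime."""
    *front, last = denominators
    if not front:
        # Base case (1 in T): a single factor, so use the division algorithm.
        A, c = divmod(a, last)
        return A, [c]
    e = prod(front)
    g, alpha1, alpha2 = extended_gcd(last, e)  # alpha1*last + alpha2*e == 1
    assert g == 1, "denominators must be pairwise relatively prime"
    # a/b = (a*alpha1)/e + (a*alpha2)/last; decompose each piece separately.
    A1, cs = decompose(a * alpha1, front)
    A2, c_last = divmod(a * alpha2, last)
    return A1 + A2, cs + [c_last]

A, cs = decompose(451, [7, 9])
print(A, cs)  # 6 [5, 4]

# Exact check of the decomposition for a denominator with three factors.
A3, cs3 = decompose(1, [2, 3, 5])
check = A3 + sum(Fraction(c, d) for c, d in zip(cs3, [2, 3, 5]))
```

Python's `divmod` returns a nonnegative remainder for a positive divisor, which matches the requirement $0 \le c_i < d_i$ even when the numerator is negative.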


The only tools used to prove Lemma 13.10 are the division algorithm, the Euclidean 
Algorithm, and Mathematical Induction. In light of this, it should come as no surprise that by 
using the division algorithm and Euclidean Algorithm for polynomials over a field, we can 
easily adapt the proof of Lemma 13.10 to prove an analogous result for decomposing rational 
functions. We will keep the notation in Lemma 13.11 as close as possible to the notation in 
Lemma 13.10 to make the similarities between positive integers and polynomials as 
transparent as possible. 


Lemma 13.11. Let $F$ be a field and let $a(x), b(x)$ be nonzero elements of the polynomial ring
$F[x]$. Suppose $b(x) = d_1(x)d_2(x) \cdots d_n(x)$, where each $d_i(x) \in F[x]$ has degree at least one
and $d_i(x), d_j(x)$ are relatively prime whenever $i \ne j$. Then

\[ \frac{a(x)}{b(x)} = A(x) + \frac{c_1(x)}{d_1(x)} + \frac{c_2(x)}{d_2(x)} + \cdots + \frac{c_n(x)}{d_n(x)}, \]

where $A(x), c_1(x), c_2(x), \ldots, c_n(x) \in F[x]$ and, for all $i$, either $c_i(x) = 0$ or $c_i(x)$ has degree less
than the degree of $d_i(x)$.


Proof. All the key ideas of this proof are contained in the proof of Lemma 13.10. It will 
simply be a matter of making some minor modifications. However, it is worth noting that 
although our ultimate goal is to work with polynomials in R[x], this lemma does hold for 
polynomials over any field. 


We will proceed by Mathematical Induction and begin by letting 


T = {n € N|a decomposition exists whenever 


b(x) is a product of n factors that are relatively prime to each other}. 


516 Chapter 13 


We must first show that 1 ¢ 7. Therefore, we begin with coe where b(x) = d;(x). Using the 
division algorithm in F[x] to divide a(x) by d(x) and then letting A(x) denote the quotient 


and c;(x) the remainder, we obtain 
a(x) = A(x) - di (x) +1 (x), 


where A(x), c1(x) € F[x] and either c;(x) = 0 or c;(x) has smaller degree than dj (x). 
Dividing this equation by b(x) = d)(x) results in 


a(x cy(x 

(x) ei 1(x) 

D(x) dj (x) 
Thus, we have decomposed oH and so, 1 € T. 


It now suffices to show that if $k$ belongs to $T$, then so does $k + 1$. We need to examine $\frac{a(x)}{b(x)}$, where $b(x) = d_1(x) d_2(x) \cdots d_k(x) d_{k+1}(x)$ and $d_i(x), d_j(x)$ are relatively prime whenever $i \neq j$. Let $e(x) = d_1(x) d_2(x) \cdots d_k(x)$; observe that $e(x)$ and $d_{k+1}(x)$ are relatively prime in $F[x]$. Therefore, the Euclidean Algorithm in $F[x]$ implies that there exist $\alpha_1(x), \alpha_2(x) \in F[x]$ such that

\[ 1 = \alpha_1(x) \cdot d_{k+1}(x) + \alpha_2(x) \cdot e(x). \]

If we multiply this equation by $a(x)$ and let $\beta_1(x) = a(x)\alpha_1(x)$ and $\beta_2(x) = a(x)\alpha_2(x)$, we obtain

\[ a(x) = \beta_1(x) \cdot d_{k+1}(x) + \beta_2(x) \cdot e(x). \]

Since $b(x) = e(x) \cdot d_{k+1}(x)$, dividing this equation by $b(x)$ yields

\[ (12) \qquad \frac{a(x)}{b(x)} = \frac{\beta_1(x)}{e(x)} + \frac{\beta_2(x)}{d_{k+1}(x)}. \]

However, $k \in T$; therefore, we can decompose $\frac{\beta_1(x)}{e(x)}$ as

\[ (13) \qquad \frac{\beta_1(x)}{e(x)} = A_1(x) + \frac{c_1(x)}{d_1(x)} + \frac{c_2(x)}{d_2(x)} + \cdots + \frac{c_k(x)}{d_k(x)}, \]

where $A_1(x), c_1(x), c_2(x), \ldots, c_k(x) \in F[x]$ and, for all $i$, either $c_i(x) = 0$ or $c_i(x)$ has smaller degree than $d_i(x)$. Furthermore, since $1 \in T$, we can rewrite $\frac{\beta_2(x)}{d_{k+1}(x)}$ as

\[ \frac{\beta_2(x)}{d_{k+1}(x)} = A_2(x) + \frac{c_{k+1}(x)}{d_{k+1}(x)}, \]

where $A_2(x), c_{k+1}(x) \in F[x]$ and $c_{k+1}(x) = 0$ or $c_{k+1}(x)$ has smaller degree than $d_{k+1}(x)$. Substituting both equation (13) and the previous equation back into equation (12) and then letting $A(x) = A_1(x) + A_2(x)$, we obtain

\[ \frac{a(x)}{b(x)} = \frac{\beta_1(x)}{e(x)} + \frac{\beta_2(x)}{d_{k+1}(x)} = \left( A_1(x) + \frac{c_1(x)}{d_1(x)} + \frac{c_2(x)}{d_2(x)} + \cdots + \frac{c_k(x)}{d_k(x)} \right) + \left( A_2(x) + \frac{c_{k+1}(x)}{d_{k+1}(x)} \right) \]

\[ = A(x) + \frac{c_1(x)}{d_1(x)} + \frac{c_2(x)}{d_2(x)} + \cdots + \frac{c_k(x)}{d_k(x)} + \frac{c_{k+1}(x)}{d_{k+1}(x)}. \]

Since $A(x), c_1(x), c_2(x), \ldots, c_{k+1}(x) \in F[x]$ and, for all $i$, either $c_i(x) = 0$ or $c_i(x)$ has smaller degree than $d_i(x)$, we have successfully decomposed $\frac{a(x)}{b(x)}$. Thus, $k + 1 \in T$, thereby concluding the proof.
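The inductive step can be tried out on ordinary fractions, where the integers play the role of $F[x]$. The following Python sketch is only an illustration under that assumption; the function names `extended_gcd` and `split_fraction` are our own and do not appear in the text. It splits $\frac{a}{e \cdot d}$, with $e$ and $d$ relatively prime, exactly as equations (12) and (13) do.

```python
from fractions import Fraction


def extended_gcd(x, y):
    """Return (g, s, t) with g = gcd(x, y) and s*x + t*y == g."""
    old_r, r = x, y
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def split_fraction(a, e, d):
    """Write a/(e*d) as A + c1/e + c2/d with 0 <= c1 < e and 0 <= c2 < d.

    Assumes e and d are positive and relatively prime (integer analog
    of the hypothesis on e(x) and d_{k+1}(x)).
    """
    g, s, t = extended_gcd(d, e)      # s*d + t*e == 1
    if g != 1:
        raise ValueError("e and d must be relatively prime")
    beta1, beta2 = a * s, a * t       # a == beta1*d + beta2*e
    A1, c1 = divmod(beta1, e)         # beta1/e == A1 + c1/e
    A2, c2 = divmod(beta2, d)         # beta2/d == A2 + c2/d
    return A1 + A2, c1, c2


A, c1, c2 = split_fraction(7, 3, 8)
# Exact check of the decomposition of 7/24 using rational arithmetic.
assert Fraction(7, 24) == A + Fraction(c1, 3) + Fraction(c2, 8)
```

The two calls to `divmod` correspond to the two uses of the division algorithm in the proof, and the sum `A1 + A2` corresponds to $A(x) = A_1(x) + A_2(x)$.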


The next big piece of the puzzle relies on a fact about positive integers that you have known 
since grade school but whose analog for polynomials might well be unfamiliar to you. Almost 
every computation you have ever seen with positive integers has probably taken place in base 
10. Recall that in base 10, the real meaning of the number 493 is 


\[ 4 \cdot 10^2 + 9 \cdot 10^1 + 3 \cdot 10^0. \]


In computer science, frequent use is made of base 2 as well as base 16. However, any positive 
integer greater than 1 is a perfectly valid base for representing the positive integers. For 
example, since 


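\[ 493 = 1 \cdot 256 + 1 \cdot 128 + 1 \cdot 64 + 1 \cdot 32 + 1 \cdot 8 + 1 \cdot 4 + 1 \cdot 1 \]
\[ = 1 \cdot 2^8 + 1 \cdot 2^7 + 1 \cdot 2^6 + 1 \cdot 2^5 + 0 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0, \]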


the representation of 493 in base 2 is 111101101. Similarly, if we wish to look at 493 in base 5, 
then we have 


\[ 493 = 3 \cdot 125 + 4 \cdot 25 + 3 \cdot 5 + 3 \cdot 1 = 3 \cdot 5^3 + 4 \cdot 5^2 + 3 \cdot 5^1 + 3 \cdot 5^0. \]

Thus, in base 5, the representation of 493 is 3433.
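The digits in any base can be found by repeatedly applying the division algorithm in $\mathbb{Z}$: each remainder is the next digit, read from right to left. A minimal Python sketch follows; the function name `to_base` is our own choice for illustration.

```python
def to_base(n, base):
    """Return the digits of the positive integer n in the given base,
    most significant digit first."""
    if n <= 0 or base < 2:
        raise ValueError("need n > 0 and base >= 2")
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)   # division algorithm in Z
        digits.append(remainder)
    return digits[::-1]


print(to_base(493, 10))
print(to_base(493, 2))
print(to_base(493, 5))
```

The three calls reproduce the representations of 493 in bases 10, 2, and 5 discussed above.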


Although we may not think of it this way, when we write polynomials we are expressing them using a base of $x$. After all, the polynomial $6x^3 - 7x^2 - 5x + 40$ really means $6 \cdot x^3 + (-7) \cdot x^2 + (-5) \cdot x^1 + 40 \cdot x^0$. Just as any integer greater than 1 can be used as a base to represent the positive integers, an analogous fact holds for polynomials. In particular, any polynomial of degree at least one can be used as a base to represent all polynomials. This idea is probably less foreign to you than you realize. In calculus, you represented many functions by Taylor polynomials. If we look at the Taylor polynomial for $6x^3 - 7x^2 - 5x + 40$ centered at $x = 1$, we obtain $34 - (x - 1) + 11(x - 1)^2 + 6(x - 1)^3$. Reordering terms, we can rewrite this as

\[ 6x^3 - 7x^2 - 5x + 40 = 6 \cdot (x - 1)^3 + 11 \cdot (x - 1)^2 + (-1) \cdot (x - 1)^1 + 34 \cdot (x - 1)^0. \]

Thus, we have written the polynomial $6x^3 - 7x^2 - 5x + 40$ using the base $x - 1$. Similarly, if we want to write $6x^3 - 7x^2 - 5x + 40$ using the base $x + 5$, we would find the Taylor polynomial for $6x^3 - 7x^2 - 5x + 40$ centered at $-5$. We would then obtain

\[ 6x^3 - 7x^2 - 5x + 40 = 6 \cdot (x + 5)^3 + (-97) \cdot (x + 5)^2 + 515 \cdot (x + 5)^1 + (-860) \cdot (x + 5)^0. \]


At first, it seems unclear whether we can also represent polynomials using any base that has degree greater than one. If we begin to experiment, we see that we can represent $6x^3 - 7x^2 - 5x + 40$ using $x^2$ as the base as

\[ 6x^3 - 7x^2 - 5x + 40 = (6x - 7) \cdot (x^2)^1 + (-5x + 40) \cdot (x^2)^0. \]

Similarly, if we wanted to represent $6x^3 - 7x^2 - 5x + 40$ using $x^2 + 1$ as the base, experimentation would eventually tell us that

\[ 6x^3 - 7x^2 - 5x + 40 = (6x - 7) \cdot (x^2 + 1)^1 + (-11x + 47) \cdot (x^2 + 1)^0. \]


Observe that when we write a number in base 10, the allowable nonzero coefficients of the 
powers of 10 are the positive integers less than 10. Similarly, in base 5, the allowable nonzero 
coefficients of the powers of 5 are the positive integers less than 5. When we look at the 
analogous situation for polynomials, when one uses a base of x or x — 1 or x +5, one would 
expect that the allowable nonzero coefficients are the nonzero polynomials of degree less than 
one. Similarly, when using a base of $x^2$ or $x^2 + 1$, one would expect that the allowable nonzero
coefficients are the nonzero polynomials of degree less than two. 


At this point it is reasonable to conjecture that when trying to represent all polynomials using 
the polynomial p(x) as a base, then the allowable nonzero coefficients are all polynomials 
whose degrees are less than the degree of p(x). However, you might be wondering how we 
find the coefficients of the powers of $p(x)$. If you fully understand the situation for positive
integers, then you should not be too surprised that we use the division algorithm to find the 
coefficients. We express this more formally in 


Lemma 13.12. Let $F$ be a field and let $p(x) \in F[x]$ have degree $m \geq 1$. Then every nonzero element of $F[x]$ can be expressed using $p(x)$ as the base. More precisely, if $a(x)$ is a nonzero element of $F[x]$, then there exist $t \geq 0$ and $b_0(x), b_1(x), \ldots, b_t(x) \in F[x]$ such that

\[ a(x) = b_t(x) \cdot p(x)^t + b_{t-1}(x) \cdot p(x)^{t-1} + \cdots + b_1(x) \cdot p(x)^1 + b_0(x) \cdot p(x)^0, \]

where each $b_i(x)$ is either 0 or has degree less than $m$.




Proof. Given $p(x)$, we will proceed using the Second Version of Mathematical Induction. To this end, we let

\[ T = \{\, n \in \mathbb{N} \cup \{0\} \mid \text{every element of } F[x] \text{ of degree } n \text{ can be written using } p(x) \text{ as the base} \,\}. \]

First, we need to show that $0 \in T$. However, if $a(x)$ has degree 0, then by letting $t = 0$ and $b_0(x) = a(x)$, we have

\[ a(x) = a(x) \cdot 1 = b_0(x) \cdot p(x)^0. \]

Since $b_0(x)$ has smaller degree than $p(x)$, we have written $a(x)$ using $p(x)$ as the base. Thus, $0 \in T$, as desired.


Next, let us suppose that $k \geq 0$ has the property that $\{0, 1, \ldots, k\} \subseteq T$. It suffices to show that $k + 1 \in T$, so we may now assume that $a(x) \in F[x]$ has degree $k + 1$. If $k + 1 < m$, then we can represent $a(x)$ using $p(x)$ as the base by letting $b_0(x) = a(x)$ to give us

\[ a(x) = a(x) \cdot 1 = b_0(x) \cdot p(x)^0, \]

as desired.

Therefore, for the remainder of the proof, we may assume that $k + 1 \geq m$. Using the division algorithm in $\mathbb{Z}$, we can divide $k + 1$ by $m$. Letting $t$ denote the quotient, we obtain

\[ k + 1 = tm + r, \]

where $0 \leq r < m$. Now, if we use the division algorithm in $F[x]$ to divide $a(x)$ by $p(x)^t$ and let $b_t(x)$ denote the quotient, we have

\[ (14) \qquad a(x) = b_t(x) \cdot p(x)^t + r(x), \]

where $r(x) = 0$ or has degree less than the degree of $p(x)^t$.


Since $a(x)$ has degree $k + 1$ and $p(x)^t$ has degree $tm$, it follows that the degree of $b_t(x)$ is $(k + 1) - tm = r < m$. Thus, $b_t(x)$ has smaller degree than $p(x)$. If $r(x) = 0$, then

\[ a(x) = b_t(x) \cdot p(x)^t \]

is a representation of $a(x)$ using $p(x)$ as a base. On the other hand, if $r(x) \neq 0$, then the degree of $r(x)$ is less than $k + 1$. Therefore, the degree of $r(x)$ belongs to the set $\{0, 1, \ldots, k\}$, which is a subset of $T$. As a result, $r(x)$ can be written using $p(x)$ as a base. Furthermore, since the degree of $r(x)$ is also less than $tm$, it follows that whenever a term of the form $p(x)^j$ appears in the representation of $r(x)$, the exponent $j$ must be less than $t$. Therefore, $r(x)$ can be written as

\[ r(x) = b_{t-1}(x) \cdot p(x)^{t-1} + \cdots + b_1(x) \cdot p(x)^1 + b_0(x) \cdot p(x)^0, \]

where each $b_i(x)$ is either 0 or has degree less than $m$. Substituting this representation for $r(x)$ into equation (14), we obtain

\[ a(x) = b_t(x) \cdot p(x)^t + b_{t-1}(x) \cdot p(x)^{t-1} + \cdots + b_1(x) \cdot p(x)^1 + b_0(x) \cdot p(x)^0, \]

where each $b_i(x)$ is either 0 or has degree less than $m$. Therefore, $a(x)$ can be represented using $p(x)$ as a base, so $k + 1 \in T$, as desired.
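The coefficients $b_j(x)$ can also be computed the way base-10 digits are: divide by $p(x)$, keep the remainder, and repeat with the quotient. The Python sketch below takes this route, which differs from the proof (the proof divides once by $p(x)^t$ and then recurses on the remainder). It assumes $F = \mathbb{Q}$ and stores a polynomial as a list of coefficients with the constant term first; the names `trim`, `poly_divmod`, and `to_poly_base` are our own.

```python
from fractions import Fraction


def trim(p):
    """Drop zero coefficients from the high-degree end of the list."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_divmod(a, p):
    """Division algorithm in Q[x]: return (quotient, remainder).

    Polynomials are coefficient lists, constant term first;
    the empty list stands for the zero polynomial.
    """
    a = [Fraction(c) for c in trim(a)]
    p = [Fraction(c) for c in trim(p)]
    if not p:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(a) - len(p) + 1, 0)
    while len(a) >= len(p):
        shift = len(a) - len(p)
        coeff = a[-1] / p[-1]
        q[shift] = coeff
        for i, c in enumerate(p):     # subtract coeff * x^shift * p(x)
            a[shift + i] -= coeff * c
        a = trim(a)                   # leading term is now zero
    return trim(q), a


def to_poly_base(a, p):
    """Return [b_0, b_1, ..., b_t] with a = sum of b_j * p^j
    and each b_j either zero or of degree less than deg p."""
    if len(trim(p)) < 2:
        raise ValueError("the base must have degree at least one")
    digits = []
    a = trim(a)
    while a:
        a, remainder = poly_divmod(a, p)
        digits.append(remainder)
    return digits


# 6x^3 - 7x^2 - 5x + 40 written using x^2 + 1 as the base.
digits = to_poly_base([40, -5, -7, 6], [1, 0, 1])
assert digits == [[47, -11], [-7, 6]]   # b_0 = -11x + 47, b_1 = 6x - 7
```

Each remainder has degree less than $m$ by the division algorithm, which is exactly the condition the lemma places on the $b_i(x)$.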


Recall that the primary goal of this section is to prove the validity of the partial fraction 
decomposition technique that is shown but not proven in calculus. That result will easily 
follow from the main result of this section which we now have all the pieces to prove. 


Theorem 13.13. Let $F$ be a field and let $a(x), b(x)$ be nonzero elements of the polynomial ring $F[x]$. Suppose $b(x) = p_1(x)^{m_1} p_2(x)^{m_2} \cdots p_n(x)^{m_n}$, where each $p_i(x) \in F[x]$ has degree at least one, each $m_i \geq 1$, and $p_i(x), p_j(x)$ are relatively prime whenever $i \neq j$. Then

\[ \frac{a(x)}{b(x)} = A(x) + \frac{b_{1,1}(x)}{p_1(x)^1} + \cdots + \frac{b_{1,m_1}(x)}{p_1(x)^{m_1}} + \cdots + \frac{b_{n,1}(x)}{p_n(x)^1} + \cdots + \frac{b_{n,m_n}(x)}{p_n(x)^{m_n}}, \]

where $A(x), b_{i,j}(x) \in F[x]$ and, for all $i, j$, either $b_{i,j}(x) = 0$ or has degree less than the degree of $p_i(x)$.


Proof. For every $1 \leq i \leq n$, let $d_i(x) = p_i(x)^{m_i}$. Then each $d_i(x)$ has degree at least one and $d_i(x), d_j(x)$ are relatively prime whenever $i \neq j$. Therefore, we can apply Lemma 13.11 to assert that

\[ (15) \qquad \frac{a(x)}{b(x)} = A(x) + \frac{c_1(x)}{d_1(x)} + \frac{c_2(x)}{d_2(x)} + \cdots + \frac{c_n(x)}{d_n(x)}, \]

where $A(x), c_1(x), c_2(x), \ldots, c_n(x) \in F[x]$ and, for all $i$, either $c_i(x) = 0$ or has degree less than the degree of $d_i(x)$.

To conclude the proof, it remains to show that for every $1 \leq i \leq n$, we can decompose $\frac{c_i(x)}{d_i(x)}$ as

\[ \frac{c_i(x)}{d_i(x)} = \frac{b_{i,1}(x)}{p_i(x)^1} + \cdots + \frac{b_{i,m_i}(x)}{p_i(x)^{m_i}}, \]

where, for every $j$, $b_{i,j}(x) = 0$ or has degree less than the degree of $p_i(x)$. In light of Lemma 13.12, we can write $c_i(x)$ using $p_i(x)$ as the base. As in the proof of Lemma 13.12, since the degree of $c_i(x)$ is less than the degree of $p_i(x)^{m_i}$, no term in the representation of $c_i(x)$ appears where the exponent of $p_i(x)$ exceeds $m_i - 1$. Therefore,

\[ c_i(x) = b_{i,1}(x) \cdot p_i(x)^{m_i - 1} + \cdots + b_{i,m_i}(x) \cdot p_i(x)^0, \]

where, for each $j$, either $b_{i,j}(x) = 0$ or has degree less than the degree of $p_i(x)$.

Dividing the previous equation by $d_i(x) = p_i(x)^{m_i}$ yields

\[ (16) \qquad \frac{c_i(x)}{d_i(x)} = \frac{b_{i,1}(x)}{p_i(x)^1} + \cdots + \frac{b_{i,m_i}(x)}{p_i(x)^{m_i}}. \]

Finally, for every $1 \leq i \leq n$, replace $\frac{c_i(x)}{d_i(x)}$ in equation (15) by the decomposition in equation (16), to obtain

\[ \frac{a(x)}{b(x)} = A(x) + \frac{b_{1,1}(x)}{p_1(x)^1} + \cdots + \frac{b_{1,m_1}(x)}{p_1(x)^{m_1}} + \cdots + \frac{b_{n,1}(x)}{p_n(x)^1} + \cdots + \frac{b_{n,m_n}(x)}{p_n(x)^{m_n}}, \]

where $A(x), b_{i,j}(x) \in F[x]$ and, for all $i, j$, either $b_{i,j}(x) = 0$ or has degree less than the degree of $p_i(x)$. Thus, $\frac{a(x)}{b(x)}$ can indeed be decomposed in the desired manner.


We can now easily prove the validity of the partial fraction decomposition technique in 
calculus by specializing Theorem 13.13 to the case where all our polynomials belong to R[x]. 


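Corollary 13.14. Let $a(x), b(x) \in \mathbb{R}[x]$, where $a(x) \neq 0$ and $b(x)$ has degree at least one. Suppose

\[ b(x) = p_1(x)^{m_1} p_2(x)^{m_2} \cdots p_s(x)^{m_s} q_1(x)^{n_1} q_2(x)^{n_2} \cdots q_t(x)^{n_t} \]

is a factorization of $b(x)$ into irreducible polynomials in $\mathbb{R}[x]$, where each $p_i(x)$ has degree one, each $q_j(x)$ has degree two, and every $m_i, n_j \geq 1$. Then there exist $A(x) \in \mathbb{R}[x]$ and $a_{i,j}, b_{i,j}, c_{i,j} \in \mathbb{R}$ such that

\[ \frac{a(x)}{b(x)} = A(x) + \frac{a_{1,1}}{p_1(x)^1} + \cdots + \frac{a_{1,m_1}}{p_1(x)^{m_1}} + \cdots + \frac{a_{s,1}}{p_s(x)^1} + \cdots + \frac{a_{s,m_s}}{p_s(x)^{m_s}} \]

\[ + \frac{b_{1,1}x + c_{1,1}}{q_1(x)^1} + \cdots + \frac{b_{1,n_1}x + c_{1,n_1}}{q_1(x)^{n_1}} + \cdots + \frac{b_{t,1}x + c_{t,1}}{q_t(x)^1} + \cdots + \frac{b_{t,n_t}x + c_{t,n_t}}{q_t(x)^{n_t}}. \]

Furthermore, $A(x) = 0$ if and only if the degree of $b(x)$ exceeds the degree of $a(x)$.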


The proof of Corollary 13.14 follows easily from Theorem 13.13. However, it is easy to get 
lost in a sea of notation. So, before proving the corollary, let us examine what it says about a 
specific example. 


Example 


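Consider the rational function

\[ \frac{9x^4 - 5x^2 + 2x - 6}{x^2 (2x - 1)^3 (5x + 2)^1 (x^2 + 1)^3 (2x^2 + 3)^1 (x^2 + 3x + 17)^2}. \]

Since the denominator has greater degree than the numerator, Corollary 13.14 asserts there exist $a_{i,j}, b_{i,j}, c_{i,j} \in \mathbb{R}$ such that

\[ \frac{9x^4 - 5x^2 + 2x - 6}{x^2 (2x - 1)^3 (5x + 2)^1 (x^2 + 1)^3 (2x^2 + 3)^1 (x^2 + 3x + 17)^2} = \frac{a_{1,1}}{x^1} + \frac{a_{1,2}}{x^2} + \frac{a_{2,1}}{(2x - 1)^1} + \frac{a_{2,2}}{(2x - 1)^2} + \frac{a_{2,3}}{(2x - 1)^3} + \frac{a_{3,1}}{(5x + 2)^1} \]

\[ + \frac{b_{1,1}x + c_{1,1}}{(x^2 + 1)^1} + \frac{b_{1,2}x + c_{1,2}}{(x^2 + 1)^2} + \frac{b_{1,3}x + c_{1,3}}{(x^2 + 1)^3} + \frac{b_{2,1}x + c_{2,1}}{(2x^2 + 3)^1} + \frac{b_{3,1}x + c_{3,1}}{(x^2 + 3x + 17)^1} + \frac{b_{3,2}x + c_{3,2}}{(x^2 + 3x + 17)^2}. \]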


It is a rather daunting and time-consuming task to actually find the 18 real numbers that appear in the numerators of the right-hand side of the previous equation. Most likely, you would begin by multiplying both sides of the equation by $x^2 (2x - 1)^3 (5x + 2)^1 (x^2 + 1)^3 (2x^2 + 3)^1 (x^2 + 3x + 17)^2$. This yields an equality of polynomials where the largest exponent appearing is 17. Comparing the coefficients of $x^0, x^1, \ldots, x^{17}$ yields 18 linear equations in 18 unknowns. In general, a system of linear equations need not have a solution. But Corollary 13.14, which relied on the division algorithm and Euclidean Algorithm in $\mathbb{R}[x]$, guarantees that there is indeed a solution.
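The count of 18 unknowns and 18 equations is no coincidence: each power of a linear factor contributes one unknown, each power of a quadratic factor contributes two, and the total always equals the degree of the denominator. The short Python sketch below checks this bookkeeping for the example; the list `FACTORS` and the two function names are our own illustrative choices.

```python
# Each factor of the denominator: (description, degree, multiplicity).
FACTORS = [
    ("x", 1, 2),
    ("2x - 1", 1, 3),
    ("5x + 2", 1, 1),
    ("x^2 + 1", 2, 3),
    ("2x^2 + 3", 2, 1),
    ("x^2 + 3x + 17", 2, 2),
]


def count_unknowns(factors):
    """Count the numerator coefficients in the decomposition:
    one per power of a linear factor, two per power of a quadratic."""
    total = 0
    for _name, degree, multiplicity in factors:
        for _power in range(1, multiplicity + 1):
            total += degree
    return total


def denominator_degree(factors):
    """Degree of the product of all the factors with multiplicity."""
    return sum(degree * multiplicity for _name, degree, multiplicity in factors)


# Coefficients of x^0, ..., x^(deg - 1) give one equation each.
assert count_unknowns(FACTORS) == denominator_degree(FACTORS)
```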


Proof of Corollary 13.14. Theorem 13.13 provides us with a decomposition of $\frac{a(x)}{b(x)}$, but we need to show that this decomposition is of the form described in Corollary 13.14. Since each $p_i(x)$ in the factorization of $b(x)$ is linear, every term in the decomposition of $\frac{a(x)}{b(x)}$ that has a power of $p_i(x)$ in the denominator must have a real number in the numerator. Similarly, since every $q_j(x)$ in the factorization of $b(x)$ is a quadratic, every term in the decomposition of $\frac{a(x)}{b(x)}$ that has a power of $q_j(x)$ in the denominator must have a numerator of the form $\alpha x + \beta$, where $\alpha, \beta \in \mathbb{R}$. Thus, in our situation, the decomposition described in Theorem 13.13 has the desired form.


Finally, in the decomposition described in this corollary, let

\[ g(x) = \frac{a_{1,1}}{p_1(x)^1} + \cdots + \frac{a_{1,m_1}}{p_1(x)^{m_1}} + \cdots + \frac{a_{s,1}}{p_s(x)^1} + \cdots + \frac{a_{s,m_s}}{p_s(x)^{m_s}} \]

\[ + \frac{b_{1,1}x + c_{1,1}}{q_1(x)^1} + \cdots + \frac{b_{1,n_1}x + c_{1,n_1}}{q_1(x)^{n_1}} + \cdots + \frac{b_{t,1}x + c_{t,1}}{q_t(x)^1} + \cdots + \frac{b_{t,n_t}x + c_{t,n_t}}{q_t(x)^{n_t}}. \]

Therefore,

\[ \frac{a(x)}{b(x)} = A(x) + g(x), \]

and if we multiply both sides of this equation by $b(x)$, we obtain

\[ a(x) = b(x) \cdot A(x) + b(x) \cdot g(x). \]


Observe that when we multiply g(x) by b(x), we obtain a sum of polynomials, all of which 
have degree less than the degree of b(x). If A(x) = 0, then the degree of the right-hand side of 
the previous equation is smaller than the degree of b(x). Thus, the degree of a(x) is smaller 
than the degree of b(x). Hence, in this case, the degree of b(x) exceeds the degree of a(x). 


On the other hand, if $A(x) \neq 0$, then the degree of the right-hand side of the previous equation
is at least as large as the degree of b(x). Hence, the degree of a(x) is at least as large as the 
degree of b(x). Thus, in this case, the degree of b(x) does not exceed the degree of a(x). 
Combining these observations, we can see that A(x) = 0 if and only if the degree of b(x) 
exceeds the degree of a(x). 


Exercises for Section 13.3 


1. Write the number 315 in base 2. 


2. Write the number 315 in base 3. 

3. Write the number 315 in base 5. 

4. Write the number 315 in base 8. 

5. Write the number 721 in base 2. 

6. Write the number 721 in base 3. 

7. Write the number 721 in base 5. 

8. Write the number 721 in base 8. 

9. Write the number 2009 in base 2. 
10. Write the number 2009 in base 3. 
11. Write the number 2009 in base 5. 
12. Write the number 2009 in base 8. 
13. Write the polynomial 7x? + 4x7 + 9x — 6 using x — 2 as the base. 
14. Write the polynomial 7x? + 4x? + 9x — 6 using x +5 as the base. 
15. Write the polynomial 7x? + 4x? + 9x — 6 using x” as the base. 
16. Write the polynomial 7x? + 4x? + 9x — 6 using x* +4 as the base. 
17. Write the polynomial 2x° — 3x++ 6x +3 using x+ 1 as the base. 




18. 
19. 
20. 
21. 
22. 
23. 


24. 


25. 


26. 


27. 
28. 
29. 


30. 


31. 


32. 


Write the polynomial 2x° — 3x++6x +3 using x —8 as the base. 
Write the polynomial 2x° — 3x++6x +3 using x? as the base. 
Write the polynomial 2x° — 3x4 +6x +3 using x? + 1 as the base. 
Write the polynomial 2x° — 3x++6x +3 using x° as the base. 
Write the polynomial 2x° — 3x+ + 6x +3 using x° +5 as the base. 


Write the polynomial 3x!! — 5x° + 2x8 — 14x° — x4+7x° — 11x? +5x+1 using x? as the 
base. 


Write the polynomial 3x!! — 5x? + 2x8 — 14x° — x4 +7x3 — 11x? +5x+1 using x? as the 
base. 


Write the polynomial 3x!! — 5x° + 2x8 — 14x° — x4+4+7x3 — 11x? +5x+1 using x* as the 
base. 


Write the polynomial 3x!! — 5x° + 2x8 — 14x° — x4+7x3 — 11x7+5x+1 using x° as the 
base. 


Find integers A, B, C such that 7 =A+4 +, whereO < B<3,0<C <8. 


Find integers A, B, C such that tl =A+8 gor £, where 0 < B<1,0<C < 24. 


Find integers A, B, C, D such that >t =A+ 8 + £ + 2, where 
0<B<1,0<C<2,0<D<4. 


4397 
Find integers A, B, C, D such that => = A+ 7 By 5 ee =, where 


0<B<3,0<C<24,0<D<6. 


(a) Find integers A, B such that x = 4 + 2 (Many different answers are possible.) 


(b) Suppose you have a large bowl and two small cups that hold 5 and 5 cups of water 
each, Feabeenvel: Explain how you could use the bowl and cups so the bowl will 
contain exactly 4 of a cup of water. 


(a) Find integers A, B such that 2 oo = 4 + 4. (Many different answers are possible.) 


(b) Suppose you have a large bowl and two small cups that hold i and “ cups of water 
each, respectively. Explain how you could use the bowl and cups so the bow] will 
contain exactly a of a cup of water. 


In exercises 33-42, decompose the rational function into a sum of a polynomial and rational 
functions in R[x] as described in Corollary 13.14. 


33. 


—x+10 
(x—1)(x+2) 


= 431x?+7x-19 
34. (x1)? (+2)? 


3x3 +15x24+13x+20 
35. aa a 


4x3 +2x?—13x+10 
36. 205) 


2x?—5x—3 
37. (x+)Q?+1) 


38 13x39 +37x2+64x4+5 
. (x?-+4) (x+1)? 


39 4x4 —10x3+40x2—91x+63 
; x(x249) 


40 16x°—3x44287x> —68x2+1285x—405 
5 x(x2+9)2 


41 10x? +36x2+19x+81 
: (x2-+ 1) (x?4+4) 


42 4x° —28x44+84x3 —786x2—676x—476 
, (x+1)(x—8) (x? +25) 


In exercises 43-46, let F(x) = Se. 


43. Find aj, a2, 43, a4 € Q such that F(x) = ae fe Came Gd, 


x241 
44, Find by, by, b3, by € R such that F(x) = ag t ST. 
45. Find c1, 2,3, c4 € Q(i) such that F(x) = 47 + 3 4+ S 


xt+i° 


46. Find d), dz, ds, d4 € C such that F(x) = 45+ B+ + 


In exercises 47-50, let G(x) = Se 


47. Find a1, a2,03,a4 € Q such that G(x) = a + ea 


x2+4 
48. Find bi, bz, b3, bg € R such that G(x) = 5 aie “a +n. 
49. Find cy, c2,¢3, cq € Q(i) such that G(x) = SP + 5 + ie 
50. Find di, dp, ds, dy € C such that G(x) = 445 + 4, tee ee 
In exercises 51-54, let H(x) = aoe. 
51. Find aj, qa, a3,a4€ Q such that A(x) = a + or : 
52. Find by, bo, b3, ba € R such that H(x) = a += a - Pats 


53. Find cy, c2, c3, c4 € Q(i) such that H(x) = S42 + + oS. 




54. 
55. 


56. 


Find dj, dy, d3, d4 € C such that H(x) = a + aa 4 — a 4. 


Show that the decomposition in Theorem 13.13 is unique. This requires showing that no 
other A(x), bj, ;(x) € Fx] are possible subject to the condition that for all 7, j, either 
b;, ;(x) = 0 or has degree less than the degree of p;(x). 


Show that the partial fraction decomposition in Corollary 13.14 is unique. This requires 
showing that no other A(x) € R[x] and qj, ;, bj, ;, c;,; € IR are possible. 


CHAPTER 14


An Introduction to Linear Algebra and 
Vector Spaces 


In the next few chapters, we will often look at “chains” of fields such as 


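\[ \mathbb{R} \subseteq \mathbb{C}, \quad \mathbb{Q} \subseteq \mathbb{Q}(\sqrt{2}), \quad \text{and} \quad \mathbb{Q} \subseteq \mathbb{Q}(\sqrt{3}) \subseteq \mathbb{Q}(\sqrt{3}, i). \]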


In order to better understand these chains, we need to introduce a new concept that deals with 
the “relative” size of a field compared to a smaller one. Since all the fields in these chains are 
infinite sets, it is not at all clear what this new concept means. To develop an understanding 
of this concept, we will need to look at many examples and introduce some new ideas and 
terms. 


14.1 Examples, Examples, Examples, and a Definition 


We begin this section with 


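Examples

1. $\mathbb{R} \subseteq \mathbb{C}$; every $\alpha \in \mathbb{C}$ can be written as $\alpha = a + bi$, where $a, b \in \mathbb{R}$. Therefore, every element of $\mathbb{C}$ can be described using two elements of $\mathbb{R}$. So, in some sense, we can think of $\mathbb{C}$ as being two times as large as $\mathbb{R}$.

2. $\mathbb{Q} \subseteq \mathbb{Q}(\sqrt{2})$; every $\alpha \in \mathbb{Q}(\sqrt{2})$ can be written as $\alpha = a + b\sqrt{2}$, where $a, b \in \mathbb{Q}$. Since every element of $\mathbb{Q}(\sqrt{2})$ can be described using two elements of $\mathbb{Q}$, we can think of $\mathbb{Q}(\sqrt{2})$ as being two times as large as $\mathbb{Q}$.

3. $\mathbb{Q} \subseteq \mathbb{Q}(\sqrt{3}) \subseteq \mathbb{Q}(\sqrt{3}, i)$; using this chain of fields, there are three chances for us to compare the relative sizes of fields, as we can compare $\mathbb{Q}(\sqrt{3}, i)$ to $\mathbb{Q}$, $\mathbb{Q}(\sqrt{3})$ to $\mathbb{Q}$, and $\mathbb{Q}(\sqrt{3}, i)$ to $\mathbb{Q}(\sqrt{3})$. Every $\alpha \in \mathbb{Q}(\sqrt{3}, i)$ can be written as

\[ \alpha = a + b\sqrt{3} + ci + di\sqrt{3} = (a + b\sqrt{3}) + (c + d\sqrt{3})i, \]

where $a, b, c, d \in \mathbb{Q}$. Therefore, it takes four elements of $\mathbb{Q}$ to describe elements of $\mathbb{Q}(\sqrt{3}, i)$ and two elements of $\mathbb{Q}$ to describe elements of $\mathbb{Q}(\sqrt{3})$. Finally, the two elements $a + b\sqrt{3}$ and $c + d\sqrt{3}$ of $\mathbb{Q}(\sqrt{3})$ can be used to describe elements of $\mathbb{Q}(\sqrt{3}, i)$. Therefore, we can think of $\mathbb{Q}(\sqrt{3}, i)$ as being four times as large as $\mathbb{Q}$ and two times as large as $\mathbb{Q}(\sqrt{3})$. We can also think of $\mathbb{Q}(\sqrt{3})$ as being two times as large as $\mathbb{Q}$.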


As we develop and formalize the idea of relative size, we are led to the concept of dimension. Since dimension is the most important concept in linear algebra, it will be the primary focus of this chapter.


Throughout this chapter, we will frequently refer to sets such as the polynomials in $\mathbb{Q}[x]$ of degree less than 3 or the polynomials in $\mathbb{R}[x]$ of degree less than 5. Technically, neither of these sets contains the polynomial $f(x) = 0$, since we did not assign a degree to this particular polynomial. However, for every $n \in \mathbb{N}$, we will adopt the convention that the polynomial $f(x) = 0$ is considered to belong to the set of polynomials of degree less than $n$.


Let us begin by looking at two large collections of examples. In each collection, we will try to 
find algebraic properties that are common to all of the examples. 


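Examples: Collection I

1. All polynomials in $\mathbb{Q}[x]$ of degree less than 2 $= \{a + bx \mid a, b \in \mathbb{Q}\}$

2. All polynomials in $\mathbb{Q}[x]$ of degree less than 3 $= \{a + bx + cx^2 \mid a, b, c \in \mathbb{Q}\}$

3. $\mathbb{Q}[x]$

4. All polynomials in $\mathbb{Q}[x]$ of degree less than 3 with 0 constant term $= \{ax + bx^2 \mid a, b \in \mathbb{Q}\}$

5. All ordered pairs of elements of $\mathbb{Q}$ $= \{(a, b) \mid a, b \in \mathbb{Q}\}$

6. All ordered four-tuples of elements of $\mathbb{Q}$ $= \{(a, b, c, d) \mid a, b, c, d \in \mathbb{Q}\}$

7. $\mathbb{Q}(i) = \{a + bi \mid a, b \in \mathbb{Q}\}$

8. $\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$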




9. All ordered pairs of elements of $\mathbb{Q}$ such that the sum of the components is 0 $= \{(a, b) \mid a, b \in \mathbb{Q} \text{ and } a + b = 0\}$

10. All four-tuples of elements of $\mathbb{Q}$ such that the first component plus twice the second component plus five times the third component minus three times the fourth component is 0 $= \{(a, b, c, d) \mid a, b, c, d \in \mathbb{Q} \text{ and } a + 2b + 5c - 3d = 0\}$


When we look at examples 1, 2, 3, 4, 7, and 8, it should be clear that given any two elements from one of these sets, these elements can be added to produce a third element of the set. For example, $2 - 5x + 8x^2$ and $7 + 2x - 11x^2$ are typical elements from the set in 2 and their sum, $9 - 3x - 3x^2$, is also an element of the set in 2. For a more general example, if $a_1 x + b_1 x^2$ and $a_2 x + b_2 x^2$ are elements from the set in 4, then their sum, $(a_1 + a_2)x + (b_1 + b_2)x^2$, is also an element from the set in 4.


To deal with examples 5, 6, 9, and 10, we must review addition in these sets. Given ordered pairs, we add them componentwise and

\[ (a, b) + (c, d) = (a + c, b + d). \]

More generally, given ordered $n$-tuples, where $n \in \mathbb{N}$, we again add componentwise and

\[ (a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n). \]


Having said this, it should now be easy to see that we can always add elements in examples 
5 and 6. However, the situation for examples 9 and 10 is a little more complicated. 


In example 9, we can always add two ordered pairs and obtain a third ordered pair. However, 
we need to make sure that the element we obtained satisfies the extra condition that the sum of 
the two components is 0. To check this, suppose (a, b) and (c, d) are two elements from 
example 9. Thus, 


\[ (1) \qquad a + b = 0 \text{ and } c + d = 0. \]

Adding our two elements we obtain $(a + c, b + d)$. To check if the sum of the two components in this ordered pair is 0, we use the information from equation (1) to obtain

\[ (a + c) + (b + d) = (a + b) + (c + d) = 0 + 0 = 0, \]

as desired. Thus the sum of any two elements from the set in example 9 remains in that set.




In example 10, suppose $(a_1, b_1, c_1, d_1)$ and $(a_2, b_2, c_2, d_2)$ are two elements from this example. Then

\[ (2) \qquad a_1 + 2b_1 + 5c_1 - 3d_1 = 0 \text{ and } a_2 + 2b_2 + 5c_2 - 3d_2 = 0. \]

Adding these two elements, we obtain $(a_1 + a_2, b_1 + b_2, c_1 + c_2, d_1 + d_2)$. To check if this element belongs to the set in 10, we need to check that the first component plus twice the second component plus five times the third component minus three times the fourth component is 0. Using the information in equation (2), we obtain

\[ (a_1 + a_2) + 2(b_1 + b_2) + 5(c_1 + c_2) - 3(d_1 + d_2) = (a_1 + 2b_1 + 5c_1 - 3d_1) + (a_2 + 2b_2 + 5c_2 - 3d_2) = 0 + 0 = 0, \]


as desired. As a result, whenever we add elements from the set in example 10, we remain in 
that set. 


It is not hard to see that addition in all ten of our examples is associative and commutative. 
Also note that every example has an additive identity element. In examples 1-4, the 
polynomial f(x) = 0 is the additive identity. This helps to illustrate why it is convenient to 
consider f(x) = 0 to be an element of the set of polynomials of degree less than n, regardless 
of the value of n. For examples 5 and 9, the ordered pair (0, 0) is the additive identity and, in 
examples 6 and 10, the ordered 4-tuple (0, 0, 0, 0) is the additive identity. In these examples, 
not only is it important that (0, 0) and (0, 0, 0, 0) behave like an identity, but it is also 
important that they belong to the given set. So, in example 10, before we can state that 

(0, 0, 0, 0) is the additive identity, we first need to convince ourselves that it satisfies the 
condition that the first component plus twice the second component plus five times the third 
component minus three times the fourth component is 0. Finally, in examples 7 and 8, the 
number 0 is the additive identity. 


Furthermore, in each example, it is also not hard to see that every element has an additive 
inverse. In particular, in example 2, the additive inverse of $a + bx + cx^2$ is $-a - bx - cx^2$, and
in example 10, the additive inverse of (a, b, c, d) is (—a, —b, —c, —d). At this point, we need 
to make a comment similar to the preceding one about checking that an element belongs to the 
set before declaring that it is the additive identity. In example 10, before we can say 

(—a, —b, —c, —d) is the additive inverse of (a, b, c, d), we first need to convince ourselves 
that (—a, —b, —c, —d) also satisfies the condition that the first component plus twice the 
second component plus five times the third component minus three times the fourth 
component is 0. Therefore, as you go about convincing yourself that every element has an 
additive inverse in the other eight examples, remember that the additive inverse needs to be an 
element of the set in the example. 




To summarize, all ten of our examples are groups under addition and the addition is commutative. We can now examine the situation regarding multiplication. As before, when dealing with $n$-tuples, multiplication will be done componentwise. Therefore, given $(a_1, a_2, \ldots, a_n)$ and $(b_1, b_2, \ldots, b_n)$, we have

\[ (a_1, a_2, \ldots, a_n) \cdot (b_1, b_2, \ldots, b_n) = (a_1 \cdot b_1, a_2 \cdot b_2, \ldots, a_n \cdot b_n). \]

Observe that examples 3, 5, 6, 7, and 8 are all commutative rings. But things are quite different in examples 1, 2, 4, 9, and 10. The polynomial $x$ belongs to the set in example 1, but the product $x \cdot x = x^2$ does not. Similarly, the polynomial $x^2$ belongs to the sets in examples 2 and 4, but the product $x^2 \cdot x^2 = x^4$ does not. Note that the ordered pair $(1, -1)$ belongs to the set in example 9, but the product $(1, -1) \cdot (1, -1) = (1, 1)$ does not satisfy the condition that the sum of the components is 0. Similarly, the 4-tuple $(2, -1, 0, 0)$ belongs to the set in example 10, but the product $(2, -1, 0, 0) \cdot (2, -1, 0, 0) = (4, 1, 0, 0)$ does not satisfy the extra condition placed on the components. Thus, in five of our ten examples, our sets are not closed under multiplication.
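These closure checks are easy to reproduce by machine. The Python sketch below is only an illustration; the function names are our own. It models examples 9 and 10 as tuples of integers, which suffices for the particular elements used above.

```python
def in_example_9(v):
    """Membership test: an ordered pair whose components sum to 0."""
    a, b = v
    return a + b == 0


def in_example_10(v):
    """Membership test: a 4-tuple with a + 2b + 5c - 3d == 0."""
    a, b, c, d = v
    return a + 2 * b + 5 * c - 3 * d == 0


def add(u, v):
    """Componentwise addition of two tuples of equal length."""
    return tuple(x + y for x, y in zip(u, v))


def multiply(u, v):
    """Componentwise multiplication of two tuples of equal length."""
    return tuple(x * y for x, y in zip(u, v))


u = (1, -1)
w = (2, -1, 0, 0)

# Sums stay in the set.
assert in_example_9(add(u, (3, -3)))
assert in_example_10(add(w, (3, 0, 0, 1)))

# Componentwise products can leave the set.
assert not in_example_9(multiply(u, u))      # (1, 1)
assert not in_example_10(multiply(w, w))     # (4, 1, 0, 0)
```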


Before going any further, we should stop and note that some subtleties arise as we develop the concept of dimension. It is easy to see that every element in the set in example 9 can be written using two elements of $\mathbb{Q}$. However, if $(a, b)$ belongs to this set, $a$ and $b$ also satisfy the condition $a + b = 0$. Therefore, $a = -b$, and we can now write $(a, b)$ as $(-b, b)$. As a result, although every element of this set consists of two components, we really only need a single element of $\mathbb{Q}$ in order to describe it. In this example, the number 2 describes the ordered pair $(-2, 2)$, whereas the number $-3$ describes the element $(3, -3)$. Therefore, at the moment, it is unclear whether the relative size of this set compared to $\mathbb{Q}$ should be 1 or 2. However, later in this chapter, it will become clear that the relative size is 1.


Similarly, every element in example 10 is of the form $(a, b, c, d)$, where $a, b, c, d \in \mathbb{Q}$, and can be written using four elements of $\mathbb{Q}$. However, since $a + 2b + 5c - 3d = 0$, we see that $a = -2b - 5c + 3d$. Therefore, we can now write $(a, b, c, d)$ as $(-2b - 5c + 3d, b, c, d)$. Thus, every element can now be written using only the three elements $b, c, d$ of $\mathbb{Q}$. So, at this point, it remains unclear whether the relative size of this set compared to $\mathbb{Q}$ is 3 or 4 or perhaps some entirely different number.


Although ordinary multiplication is not possible in five of our ten examples, as we develop the concept of dimension, it will suffice to look at a somewhat different and weaker form of multiplication. The sets in examples 3, 7, and 8 are all commutative rings that contain $\mathbb{Q}$, so we can always multiply elements of these sets by elements of $\mathbb{Q}$. However, notice that although the sets in examples 1, 2, and 4 are not rings, we can always multiply elements of these sets by elements of $\mathbb{Q}$. For example, in 4, observe that $5 \cdot (3x - 8x^2) = 15x - 40x^2$. When dealing with $n$-tuples in 5, 6, 9, and 10, we can also multiply by elements of $\mathbb{Q}$ by doing the multiplication componentwise. In other words, given $\alpha \in \mathbb{Q}$ and an $n$-tuple $(a_1, a_2, \ldots, a_n)$, we have

\[ \alpha \cdot (a_1, a_2, \ldots, a_n) = (\alpha a_1, \alpha a_2, \ldots, \alpha a_n). \]


An important aspect of this type of multiplication is that when we multiply an element from the sets in 9 or 10 by an element in $\mathbb{Q}$, we remain in the set. To check this for example 9, suppose $\alpha \in \mathbb{Q}$ and suppose $(a, b)$ is an ordered pair such that $a + b = 0$. Then $\alpha \cdot (a, b) = (\alpha a, \alpha b)$. To see that this element still belongs to the set in 9, observe that

\[ \alpha a + \alpha b = \alpha(a + b) = \alpha \cdot 0 = 0, \]

as required. Similarly, if $\alpha \in \mathbb{Q}$ and if $(a, b, c, d)$ belongs to the set in 10, then $a + 2b + 5c - 3d = 0$ and $\alpha \cdot (a, b, c, d) = (\alpha a, \alpha b, \alpha c, \alpha d)$. To check that $(\alpha a, \alpha b, \alpha c, \alpha d)$ belongs to the set in 10, we need to check that the first component plus twice the second component plus five times the third component minus three times the fourth component is 0. Observe that

\[ \alpha a + 2(\alpha b) + 5(\alpha c) - 3(\alpha d) = \alpha(a + 2b + 5c - 3d) = \alpha \cdot 0 = 0, \]

as required.
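The same check can be carried out with exact rational arithmetic. In the Python sketch below, which is only an illustration with names of our own choosing, elements of example 10 are built from the three free components $b, c, d$ via $a = -2b - 5c + 3d$, and `Fraction` stands in for $\mathbb{Q}$.

```python
from fractions import Fraction


def element_of_example_10(b, c, d):
    """Build (a, b, c, d) with a = -2b - 5c + 3d, so the condition holds."""
    return (-2 * b - 5 * c + 3 * d, b, c, d)


def satisfies_condition(v):
    """Check a + 2b + 5c - 3d == 0."""
    a, b, c, d = v
    return a + 2 * b + 5 * c - 3 * d == 0


def scale(alpha, v):
    """Multiply every component of the tuple v by the scalar alpha."""
    return tuple(alpha * x for x in v)


v = element_of_example_10(Fraction(1, 2), Fraction(3), Fraction(-7, 4))
assert satisfies_condition(v)

# Multiplying by a rational number keeps us inside the set.
assert satisfies_condition(scale(Fraction(5, 3), v))
```

Using `Fraction` rather than floating-point numbers keeps the equality test exact, matching the algebraic argument above.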


As we summarize the properties shared by all ten of our examples, we can now say that all ten 
are 


(a) commutative groups under addition and 
(b) allow multiplication by elements of Q. 


Let us now look at a second collection of examples. 


Examples: Collection II

1. All polynomials in $\mathbb{R}[x]$ of degree less than 2 $= \{a + bx \mid a, b \in \mathbb{R}\}$

2. All polynomials in $\mathbb{R}[x]$ of degree less than 3 $= \{a + bx + cx^2 \mid a, b, c \in \mathbb{R}\}$

3. $\mathbb{R}[x]$

4. All polynomials in $\mathbb{R}[x]$ of degree less than 3 with 0 constant term $= \{ax + bx^2 \mid a, b \in \mathbb{R}\}$

5. All ordered pairs of elements of $\mathbb{R}$ $= \{(a, b) \mid a, b \in \mathbb{R}\}$

6. All ordered four-tuples of elements of $\mathbb{R}$ $= \{(a, b, c, d) \mid a, b, c, d \in \mathbb{R}\}$

7. $\mathbb{C} = \{a + bi \mid a, b \in \mathbb{R}\}$

8. All ordered pairs of elements of $\mathbb{R}$ such that the sum of the components is 0 $= \{(a, b) \mid a, b \in \mathbb{R} \text{ and } a + b = 0\}$

9. All four-tuples of elements of $\mathbb{R}$ such that the first component plus twice the second component plus five times the third component minus three times the fourth component is 0 $= \{(a, b, c, d) \mid a, b, c, d \in \mathbb{R} \text{ and } a + 2b + 5c - 3d = 0\}$

10. $\{f : \mathbb{R} \to \mathbb{R} \mid f \text{ is continuous everywhere}\}$

11. $\{f : \mathbb{R} \to \mathbb{R} \mid f \text{ is differentiable everywhere}\}$

12. $\{f : \mathbb{R} \to \mathbb{R} \mid f \text{ is continuous everywhere and } f(5) = 0\}$


Many of these 12 examples are similar to examples from our first collection. Once again 
observe that all 12 sets in these examples are commutative groups under addition. Also note, in 
some of these examples, that it is not always possible to multiply two elements and remain in 
the set. In particular, this is the situation in 1, 2, 4, 8, and 9. However, in all 12 examples, it is 
always possible to multiply an element from one of our sets by a real number and still stay in 
the set. In our first collection of examples, all of our sets were commutative groups under 
addition in which we were also allowed to multiply by elements of Q. In our second collection 
of examples, all of our sets are commutative groups under addition in which we are also 
allowed to multiply by elements of R. Thus, as we attempt to formalize things, we need to 
realize that we are looking at two interconnected algebraic objects: a set V that is a 
commutative group under addition and a set F that is a field. The connection between V and F 
is that we can multiply elements of V by elements of F to obtain elements of V. By collecting 
the properties possessed by V and F, we are led to 


Definition 14.1. A set V is called a vector space over a field F if V is a commutative group
under addition and elements of V can be multiplied by elements of F to produce elements of V
such that, for all α, β ∈ F and v, w ∈ V, we have


1. α·(v + w) = α·v + α·w,
2. (α + β)·v = α·v + β·v,
3. α·(β·v) = (αβ)·v, and
4. 1·v = v, where 1 is the multiplicative identity in F.


It is important to understand the meaning of properties 1-4 in Definition 14.1. Properties 1 and 
2 are types of distributive laws. For example, suppose 


V = {(a, b) | a, b ∈ R} and F = R.

To illustrate property 1, we have

3·((4, −7) + (−8, 2)) = 3·(−4, −5) = (−12, −15)

and

3·(4, −7) + 3·(−8, 2) = (12, −21) + (−24, 6) = (−12, −15).

Thus,

3·((4, −7) + (−8, 2)) = 3·(4, −7) + 3·(−8, 2).

To illustrate property 2, observe that

(5 + 9)·(11, −5) = 14·(11, −5) = (154, −70)

and

5·(11, −5) + 9·(11, −5) = (55, −25) + (99, −45) = (154, −70).

Thus,

(5 + 9)·(11, −5) = 5·(11, −5) + 9·(11, −5).

Property 3 is a type of associative law. Remember that since F is a field, we are allowed to
multiply elements of F. Also, we are allowed to multiply an element of V by an element of F
to obtain an element of V. However, we are not permitted to multiply two elements of V.


When we analyze the terms that appear in property 3, keep in mind that αβ ∈ F, whereas
β·v, α·(β·v), (αβ)·v ∈ V. To illustrate this property, using V and F as previously, we have

−2·(7·(−4, 1)) = −2·(−28, 7) = (56, −14)

and

(−2·7)·(−4, 1) = −14·(−4, 1) = (56, −14).

Thus,

−2·(7·(−4, 1)) = (−2·7)·(−4, 1).


Observe that a second associative law of the form α·(vw) = (α·v)w would not make any
sense as we are not permitted to multiply elements of V by elements of V.


The easiest property to understand is 4. It asserts that the multiplicative identity of F continues 
to behave like a multiplicative identity, even when we multiply it by elements of V. 
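
The numerical illustrations of properties 1-4 can also be checked mechanically. The following is a minimal sketch, assuming V is the set of ordered pairs of rational numbers with componentwise operations; the helper names `add`, `scale`, and `check_axioms` are hypothetical and are not part of the text. Passing such a check for a few inputs illustrates the properties but does not prove them.

```python
from fractions import Fraction as Fr

def add(v, w):
    """Componentwise addition of two tuples of the same length."""
    return tuple(a + b for a, b in zip(v, w))

def scale(alpha, v):
    """Scalar multiplication: multiply every component of v by alpha."""
    return tuple(alpha * a for a in v)

def check_axioms(alpha, beta, v, w):
    """Return True if properties 1-4 of Definition 14.1 hold for these inputs."""
    p1 = scale(alpha, add(v, w)) == add(scale(alpha, v), scale(alpha, w))
    p2 = scale(alpha + beta, v) == add(scale(alpha, v), scale(beta, v))
    p3 = scale(alpha, scale(beta, v)) == scale(alpha * beta, v)
    p4 = scale(Fr(1), v) == v
    return p1 and p2 and p3 and p4

# The three computations carried out in the text.
print(scale(3, add((4, -7), (-8, 2))))                 # property 1, left side
print(add(scale(5, (11, -5)), scale(9, (11, -5))))     # property 2, right side
print(scale(-2, scale(7, (-4, 1))))                    # property 3, left side

# A spot check with rational scalars and vectors.
print(check_axioms(Fr(1, 2), Fr(-3), (Fr(4), Fr(-7)), (Fr(-8), Fr(2))))
```

Exact rational arithmetic (`fractions.Fraction`) is used here so that equality tests are not affected by floating-point rounding.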




If V is a vector space over a field F, we may simply say that V is a vector space over F. 
Sometimes we are even more informal and do not mention F at all and merely say that V is a 
vector space. We call the elements of V vectors and the elements of F scalars. Multiplication 
of an element of V by an element of F is called scalar multiplication. Observe that both V 
and F have an additive identity, and, depending on the situation, this could cause some 
confusion. If the additive identity of V is of the form (0, 0) or (0, 0, 0, 0), then it looks nothing 
like the additive identity of F and no confusion arises. But, in situations where there might be 
some confusion recognizing which elements are vectors and which are scalars, vectors may be 
written with an arrow, such as v⃗.


In Lemma 5.14(c), we showed that multiplication by 0 in a ring always gives 0 as the answer. 
As we think about the analogous situation for vector spaces, we note that we can multiply 
vectors by the scalar 0 and can also multiply the vector 0 by scalars. We would expect that, in 
both cases, we would obtain 0 as the answer. The next lemma shows that this is the case and 
also provides us with some basic facts that will be useful when performing various 
computations within a vector space. When reading the following proofs, make sure you 
understand why all the various equalities hold. In virtually every case, equality will be a 
consequence of the associative and distributive laws along with the properties of additive 
identities and inverses. 


Lemma 14.2. If V is a vector space over a field F, then

(a) 0·v = 0⃗, for all v ∈ V.

(b) α·0⃗ = 0⃗, for all α ∈ F.

(c) (−α)·v is the additive inverse of α·v, for all α ∈ F and v ∈ V. In particular, (−1)·v is
the additive inverse of v.

(d) If α, β ∈ F and v ∈ V such that α·v = β·v, then either α = β or v = 0⃗. In particular, if
α·v = 0⃗, then either α = 0 or v = 0⃗.


Proof. For part (a), if v ∈ V, we have

0·v = 0·v + 0⃗ = 0·v + (0·v − 0·v) =
(0·v + 0·v) − 0·v = (0 + 0)·v − 0·v = 0·v − 0·v = 0⃗.


For part (b), if α ∈ F, we have

α·0⃗ = α·0⃗ + 0⃗ = α·0⃗ + (α·0⃗ − α·0⃗) =
(α·0⃗ + α·0⃗) − α·0⃗ = α·(0⃗ + 0⃗) − α·0⃗ = α·0⃗ − α·0⃗ = 0⃗.


For part (c), if v ∈ V, we can use part (a) to obtain

α·v + (−α)·v = (α − α)·v = 0·v = 0⃗.


Since addition is commutative, we also know that (−α)·v + α·v = 0⃗. Therefore, when we add
(−α)·v to α·v, we obtain the additive identity of V. Thus (−α)·v is the additive inverse of
α·v. If we specialize to the situation where α = 1, we can see that (−1)·v is the additive
inverse of 1·v = v.


For part (d), suppose α·v = β·v; we need to show that if α ≠ β then v = 0⃗. It now follows from
part (c) that (α − β)·v = 0⃗. If α ≠ β, then α − β is invertible, and if we multiply the equation

0⃗ = (α − β)·v

by (α − β)⁻¹ and use part (b), we obtain

0⃗ = (α − β)⁻¹·0⃗ = (α − β)⁻¹·((α − β)·v) = ((α − β)⁻¹(α − β))·v = 1·v = v.

Thus, if α ≠ β, then v = 0⃗, as desired.


Finally, if α·v = 0⃗ then, using part (a), we have α·v = 0·v. It now follows from our previous
argument that either α = 0 or v = 0⃗.


Having looked at many examples of sets which are vector spaces, it is also instructive to look 
at some sets which are not vector spaces. 

Examples—Some Sets That Are Not Vector Spaces over Q

1. Let V₁ = {(a, b) | a, b ∈ Q and a·b ≥ 0}; observe that
(2, 1), (−1, −2) ∈ V₁ but (2, 1) + (−1, −2) = (1, −1) ∉ V₁.
Thus, V₁ is not closed under addition and is therefore not a vector space.

2. Let V₂ = {a + bx ∈ Q[x] | a² = b²}; then
1 + x, 1 − x ∈ V₂ but (1 + x) + (1 − x) = 2 ∉ V₂.
Thus, V₂ is also not closed under addition, so it is also not a vector space.

3. Let V₃ = {(a, b) | a, b ∈ Q and a + b = 1}; then (1, 0) ∈ V₃ but 2·(1, 0) = (2, 0) ∉ V₃.
Therefore, scalar multiplication is not possible in V₃. You should also check that V₃ is
not closed under addition.

4. Let V₄ = {(a, b) | a, b ∈ Z}; although V₄ is a group under addition, scalar
multiplication is not possible as

(1, 1) ∈ V₄ and ½·(1, 1) = (½, ½) ∉ V₄.
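
The failures above can be reproduced with a short computation. This is a minimal sketch, assuming pairs are stored as tuples and each set is described by a membership test; the predicate names `in_V1`, `in_V3`, and `in_V4` are hypothetical labels for the sets V₁, V₃, and V₄ of the examples.

```python
from fractions import Fraction as Fr

def add(v, w):
    """Componentwise addition of two pairs."""
    return tuple(a + b for a, b in zip(v, w))

def scale(alpha, v):
    """Scalar multiplication of a pair by alpha."""
    return tuple(alpha * a for a in v)

def in_V1(v):
    """Membership test for V1: the product of the components is nonnegative."""
    return v[0] * v[1] >= 0

def in_V3(v):
    """Membership test for V3: the components add up to 1."""
    return v[0] + v[1] == 1

def in_V4(v):
    """Membership test for V4: both components are integers."""
    return all(Fr(a).denominator == 1 for a in v)

# V1 is not closed under addition.
print(in_V1((2, 1)), in_V1((-1, -2)), in_V1(add((2, 1), (-1, -2))))

# V3 is not closed under scalar multiplication.
print(in_V3((1, 0)), in_V3(scale(2, (1, 0))))

# V4 is not closed under multiplication by the rational scalar 1/2.
print(in_V4((1, 1)), in_V4(scale(Fr(1, 2), (1, 1))))
```

A single failing pair of inputs is enough to show that a set is not a vector space, which is why each check needs only one counterexample.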


Suppose V = {(a, b) | a, b ∈ Q} and W = {(a, b) | a, b ∈ R}; then certainly V is a vector space
over Q and W is a vector space over R. Note that W is also a vector space over Q. More
generally, if a set is a vector space over a field F, then it is also a vector space over any field
contained in F. However, a similar result does not hold when we move to a field that is larger
than F. In particular, V is not a vector space over R. Observe that (0, 1) ∈ V but √2·(0, 1) =
(0, √2) ∉ V, so scalar multiplication is not possible in V.


We now need to focus our attention on two examples of particular importance. 


Examples—Two Fundamental Examples

1. If F is a field, let

Fⁿ = {(a₁, a₂, ..., aₙ) | aᵢ ∈ F}.

Fⁿ is a vector space over F where both addition and scalar multiplication are done
componentwise.


2. If F ⊆ K are fields, then K is a vector space over F. To see this, we first note that K is
certainly a group under addition. Furthermore, since K is a field, K satisfies the
distributive laws, its multiplication is associative, and it has a multiplicative identity.
Therefore, properties 1-4 of Definition 14.1 hold for all elements of K. However,
since F ⊆ K, scalar multiplication of elements of K by elements of F is merely a
special case of multiplication in K. Thus, scalar multiplication automatically inherits
properties 1-4 of Definition 14.1.


In Chapter 5, we remarked that if R is a commutative ring, then R might be a field but it need
not be a field. Similarly, if V is a vector space over F, then V might also be a commutative
ring or field but it need not be. For example, Q[x] is not only a vector space over Q but is also
a commutative ring. On the other hand, V = {ax + bx² | a, b ∈ Q} is a vector space over Q but is
not a commutative ring. When dealing with fields F ⊆ K, in some applications it is useful to
temporarily ignore the fact that K is a field and to simply think of K as being a vector space
over F.


Exercises for Section 14.1 
In exercises 1-14, you will be doing computations in
Q⁴ = {(a, b, c, d) | a, b, c, d ∈ Q},
which is a vector space over Q. In Q⁴, let


v₁ = (5, 0, 0, 6), v₂ = (1, 2, 0, 0), v₃ = (0, 30, 0, −18),
v₄ = (0, 6, 4, 0), v₅ = (1, 0, 5, 2).


1. Compute 2p. 

2. Compute — $03. 

3. Compute 3v₂ − 6v₄.

4. Compute 2v3 — 3 U4. 

5. Compute —v; + 4v2 — $US. 

6. Compute 3v2 — iu, +705. 

7. Compute −5v₂ − 7v₃ + v₄ − 6v₅.

8. Compute −2v₁ + v₂ − 6v₃ + v₅.

9. Compute 4v₁ − v₂ + 8v₃ − 5v₄ + 11v₅.
10. Compute —3v; —7v2 + 03+ 4+ 9s. 
11. Can you find α, β ∈ Q such that v₃ = α·v₁ + β·v₂?
12. Can you find α, β ∈ Q such that v₁ = α·v₂ + β·v₃?
13. Can you find α, β ∈ Q such that v₅ = α·v₁ + β·v₄?
14. Show that if α, β, γ, δ ∈ Q such that α·v₂ + β·v₃ + γ·v₄ + δ·v₅ = (0, 0, 0, 0), then
α = β = γ = δ = 0.


In exercises 15—30, you will be doing computations in R[x], which is a vector space over R. In 
R[x], let 


Ww] = 3x7 — 1, wo=4x+5, w3 =x —2x, 


wa=2x°+5, ws=9x—-2. 


15. Compute J/2w>. 


16. Compute 5 Wa. 




17. Compute 7w,;+7w3. 

18. Compute —3w3+ ss. 

19. Compute 11lw,— Sw. + Sw. 

20. Compute —/7w3—6wa+5ws. 

21. Compute 4w) — 3w3+3w4—-Tus. 

22. Compute 6w, + w3— Wa —4ws. 

23. Compute 2w, +3w2—5w3—w4+7ws. 

24. Compute —9w, —6w2+8w3-—w4t2ws. 

25. Can you find α, β ∈ R such that w₁ = α·w₂ + β·w₃?
26. Can you find α, β ∈ R such that w₄ = α·w₂ + β·w₃?
27. Can you find α, β ∈ R such that 1 = α·w₂ + β·w₅?
28. Can you find α, β ∈ R such that x = α·w₂ + β·w₅?
29. Can you find a, B, y € R such that x7 = @- w+ B- waty-ws? 


30. Show that if α, β, γ, δ ∈ R such that α·w₁ + β·w₂ + γ·w₃ + δ·w₅ = 0, then
α = β = γ = δ = 0.


In exercises 31-42, determine if the given set is a vector space over Q. If it is not a 
vector space, briefly explain why it fails to be a vector space. 


31. { f(x) € QLx]| F110) = 0}. 
32. { f(x) € QLx]| f10) > O}. 
33. { f(x) € QR fA) = fQ)}. 
34. { f(x) € Q[x]| f() has a root in Q}. 
35. { f(x) € QLx]| (0) € Z}. 
36. {g(x) € R[x]|g(0) € Q}. 
37. {g(x) € RIx]|g'(0) € Qh. 
38. {g(x) € RIx]|g@ = 0}. 

39. {g(x) € RIx]lg@ € Qh. 
40. {g(x) € R[Ix]|g@ € Z}. 

41. Q(√2)

42. {r ∈ Q(√2) | r² ∈ Q}.


In exercises 43-52, determine if the given subset of R⁴ is a vector space over R. If it is not a
vector space, briefly explain why it fails to be a vector space.


43. {(a, b, c, d) ∈ R⁴ | a = b and c = d}.

44. {(a, b, c, d) ∈ R⁴ | a = b²}.

45. {(a, b, c, d) ∈ R⁴ | b = c = 0}.

46. {(a, b, c, d) ∈ R⁴ | a = 2b − 3c}.

47. {(a, b, c, d) ∈ R⁴ | ab > 0}.

48. {(a, b, c, d) ∈ R⁴ | a + b < c + d}.

49. {(a, b, c, d) ∈ R⁴ | 2a + 4c = 5b + d}.

50. {(a, b, c, d) ∈ R⁴ | b = c and d = 0}.

51. {(a, b, c, d) ∈ R⁴ | a² + b² + c² + d² > 0}.

52. {(a, b, c, d) ∈ R⁴ | a ∈ Q}.


14.2 Spanning Sets and Linear Independence 


Now that we have worked with quite a few examples of vector spaces, we can start 
introducing the concepts that will lead us to an understanding of dimension. 


Definition 14.3. If V is a vector space over a field F and if v₁, v₂, ..., vₙ ∈ V, then a linear
combination of v₁, v₂, ..., vₙ is any element of V that can be written in the form α₁·v₁ +
α₂·v₂ + ··· + αₙ·vₙ, where each αᵢ ∈ F.

Examples

1. V = Q[x] is a vector space over Q and let us examine 1, x, x² ∈ Q[x]. The elements
32 14-4 x4 (—S)? S — 5 + +, 
0·1 + 0·x + 2·x² = 2x²,

: 142-x+0-x? =2x4+ : 

—. “Xx ‘ES ZX =; 

3 3 

0·1 + 0·x + 0·x² = 0


are all linear combinations of 1, x, x². However, observe that x³, 2x⁴ − 1, and
x³ − 6x + 1 are not linear combinations of 1, x, x².


2. V = Q³ is a vector space over Q and let us examine (1, 2, 3), (−1, 0, 2) ∈ Q³. The
elements

2·(1, 2, 3) + 5·(−1, 0, 2) = (−3, 4, 16),
(−4)·(1, 2, 3) + 3·(−1, 0, 2) = (−7, −8, −6),
0·(1, 2, 3) + (−8)·(−1, 0, 2) = (8, 0, −16)


are all linear combinations of (1, 2, 3), (−1, 0, 2).


Suppose we wish to determine if (1, 0, 1) is a linear combination of (1, 2, 3) and
(−1, 0, 2). This would mean that there exist α, β ∈ Q such that

(3) α·(1, 2, 3) + β·(−1, 0, 2) = (1, 0, 1).

However,

(4) α·(1, 2, 3) + β·(−1, 0, 2) = (α − β, 2α, 3α + 2β).


In light of equations (3) and (4), α and β would simultaneously need to satisfy the
equations

(5) α − β = 1, 2α = 0, 3α + 2β = 1.


The only α, β which satisfy the first two equations in (5) are α = 0, β = −1. But these
values of α, β do not satisfy the third equation in (5), since 3·0 + 2·(−1) = −2. Thus, there
are no α, β ∈ Q that simultaneously satisfy the three equations in (5). Therefore, (1, 0, 1)
is not a linear combination of (1, 2, 3) and (−1, 0, 2).


Now suppose we wish to determine if (10, 12, 10) is a linear combination of (1, 2, 3) and
(−1, 0, 2). Using (4), we see that we need to find α, β ∈ Q such that

(6) α − β = 10, 2α = 12, 3α + 2β = 10.


If we solve the first two equations in (6), we see that α = 6, β = −4 is a solution. These
values of α, β are also a solution to the third equation in (6). Thus,

(10, 12, 10) = 6·(1, 2, 3) + (−4)·(−1, 0, 2)

and so, (10, 12, 10) is a linear combination of (1, 2, 3) and (−1, 0, 2).


The calculations done in the previous example certainly suggest that there is a connection 
between linear equations and our work on vector spaces. 
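
The connection with linear equations can be made concrete. The sketch below decides whether a target vector is a linear combination of given vectors in Qⁿ by solving the corresponding system with Gauss-Jordan elimination; this elimination procedure is a standard technique that the text has not introduced, and the function name `solve_combination` is hypothetical.

```python
from fractions import Fraction as Fr

def solve_combination(vectors, target):
    """Return coefficients c with sum(c[j] * vectors[j]) == target, or None.

    Works over Q using Gauss-Jordan elimination on the augmented matrix whose
    columns are the given vectors followed by the target. If several solutions
    exist, the free coefficients are set to 0 and one solution is returned.
    """
    rows, cols = len(target), len(vectors)
    # Row i holds the i-th components of the vectors, then target[i].
    m = [[Fr(v[i]) for v in vectors] + [Fr(target[i])] for i in range(rows)]
    pivots = []  # (row, column) position of each pivot
    r = 0
    for c in range(cols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        # Clear column c in every other row.
        for i in range(rows):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append((r, c))
        r += 1
        if r == rows:
            break
    # A row reading 0 = nonzero means the system has no solution.
    if any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in m):
        return None
    coeffs = [Fr(0)] * cols
    for pr, pc in pivots:
        coeffs[pc] = m[pr][-1]
    return coeffs

vectors = [(1, 2, 3), (-1, 0, 2)]
print(solve_combination(vectors, (1, 0, 1)))
print(solve_combination(vectors, (10, 12, 10)))
```

For the two targets treated in the example, the first call reports that no coefficients exist and the second recovers the coefficients found by hand.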




Definition 14.4. If V is a vector space over F and if S is a finite subset of V then the span of 
S, written as span(S), is the set of all linear combinations of elements of S. If span(S) = V, 
then we say that S spans V or S is a spanning set of V. 


Examples


1. Let V = Q[x]; this is a vector space over Q, and we will look at the span of various
subsets of V.


(a) If S₁ = {1, x}, then span(S₁) consists of those elements of V that can be written
in the form α·1 + β·x, where α, β ∈ Q. Thus, span(S₁) is the set of all
polynomials in Q[x] of degree less than 2. In particular, x² ∉ span(S₁).


(b) If S₂ = {x, x²}, then span(S₂) consists of those elements of V that can be
written in the form α·x + β·x², where α, β ∈ Q. Thus, span(S₂) is the set of all
polynomials in Q[x] of degree less than 3 whose constant term is 0. As a result,
neither x² + 1 nor x³ belongs to span(S₂).


(c) If S₃ = {1, x², x⁴, x⁶, x⁸}, then span(S₃) consists of all polynomials in Q[x] of
degree less than 9 such that only terms with an even exponent appear. For
example, 1 ag ox? — ~ € span(S3), whereas x!° and 5x®° — 8x? do not 
belong to span(S3). 


2. Let W = R³; this is a vector space over R, and we will look at the span of various
subsets of W.


(a) If S₁ = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}, then S₁ spans W. To see this, observe that if
(α, β, γ) ∈ R³, then

(α, β, γ) = α·(1, 0, 0) + β·(0, 1, 0) + γ·(0, 0, 1) ∈ span(S₁).


(b) If S₂ = {(0, 1, 0), (0, 0, 1)}, then span(S₂) consists of all those elements in W
whose first component is 0. To see this, note that

(0, α, β) = α·(0, 1, 0) + β·(0, 0, 1) ∈ span(S₂).


(c) If S₃ = {(6, 6, 6)}, then span(S₃) consists of all those elements in W where all
three components are equal. Observe that

(α, α, α) = (α/6)·(6, 6, 6) ∈ span(S₃).

(d) Let S₄ = {(1, 2, 0), (2, 1, 0), (0, 0, 1)}; we claim that S₄ spans W. In example (a),
we showed that the set S₁ = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} spans W. We will now
show that all three members of S₁ belong to span(S₄). For the vector (0, 0, 1),
there is nothing to prove. To see that the first two elements of S₁ belong to
span(S₄), observe that

(1, 0, 0) = −(1/3)·(1, 2, 0) + (2/3)·(2, 1, 0) and (0, 1, 0) = (2/3)·(1, 2, 0) − (1/3)·(2, 1, 0).


Having shown that S₁ ⊆ span(S₄), we can see that if (α, β, γ) ∈ R³, then

(α, β, γ) = α·(1, 0, 0) + β·(0, 1, 0) + γ·(0, 0, 1)
= α·(−(1/3)·(1, 2, 0) + (2/3)·(2, 1, 0)) + β·((2/3)·(1, 2, 0) − (1/3)·(2, 1, 0)) + γ·(0, 0, 1)
= ((2β − α)/3)·(1, 2, 0) + ((2α − β)/3)·(2, 1, 0) + γ·(0, 0, 1) ∈ span(S₄).


This example illustrates the useful fact that in order to show that a set T spans W, it suffices to 
show that span(T) contains some other spanning set S. We record this as 


Lemma 14.5. Let V be a vector space over F and let S and T be finite subsets of V. If
S ⊆ span(T), then span(S) ⊆ span(T). Therefore, if S ⊆ span(T) and S spans V, then T also
spans V.


Proof. Suppose S ⊆ span(T) and let s ∈ span(S); we need to show that s ∈ span(T). Since
s ∈ span(S), there exist vᵢ ∈ S and αᵢ ∈ F such that

(7) s = α₁·v₁ + α₂·v₂ + ··· + αₙ·vₙ.

However, each vᵢ ∈ span(T), so there exist wⱼ ∈ T and βᵢⱼ ∈ F such that

(8) vᵢ = βᵢ₁·w₁ + βᵢ₂·w₂ + ··· + βᵢₘ·wₘ,

for all i ≤ n. Therefore, in (7), we can replace each vᵢ by a linear combination of the wⱼ's from
(8) to obtain

s = α₁·(β₁₁·w₁ + β₁₂·w₂ + ··· + β₁ₘ·wₘ) + α₂·(β₂₁·w₁ + β₂₂·w₂ + ··· + β₂ₘ·wₘ) +
··· + αₙ·(βₙ₁·w₁ + βₙ₂·w₂ + ··· + βₙₘ·wₘ) =
(α₁β₁₁ + α₂β₂₁ + ··· + αₙβₙ₁)·w₁ + (α₁β₁₂ + α₂β₂₂ + ··· + αₙβₙ₂)·w₂ +
··· + (α₁β₁ₘ + α₂β₂ₘ + ··· + αₙβₙₘ)·wₘ ∈ span(T).


Thus, span(S) ⊆ span(T).


It now easily follows that if S ⊆ span(T) and S spans V, then V = span(S) ⊆ span(T). Hence,
T spans V.
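
The spanning example with {(1, 2, 0), (2, 1, 0), (0, 0, 1)} that precedes Lemma 14.5 can be spot-checked numerically. The following is a minimal sketch over Q rather than R, so that exact arithmetic is available; the names `combine` and `coefficients_in_S4` are hypothetical, and the coefficient formula is the one obtained by substituting the expressions for (1, 0, 0) and (0, 1, 0) into an arbitrary vector (a, b, c).

```python
from fractions import Fraction as Fr

S4 = [(1, 2, 0), (2, 1, 0), (0, 0, 1)]

def combine(coeffs, vectors):
    """Return the linear combination sum(coeffs[i] * vectors[i])."""
    n = len(vectors[0])
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(n))

def coefficients_in_S4(a, b, c):
    """Coefficients expressing (a, b, c) in terms of the three vectors of S4."""
    a, b, c = Fr(a), Fr(b), Fr(c)
    return [(2 * b - a) / 3, (2 * a - b) / 3, c]

# The two standard basis vectors handled explicitly in the example.
print(coefficients_in_S4(1, 0, 0))
print(coefficients_in_S4(0, 1, 0))

# An arbitrary rational vector is recovered from its coefficients.
target = (Fr(7), Fr(-1, 2), Fr(3))
print(combine(coefficients_in_S4(*target), S4) == target)
```

A check of this kind confirms the arithmetic for particular vectors; the general statement is what the lemma proves.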


If we consider the vector space Q² over Q, it is easy to see that S = {(1, 0), (1, 1), (0, 1)}
spans Q². In this case, elements of Q² can be written in several different ways as linear
combinations of elements of S. For example, we have

(4, 2) = 2·(1, 0) + 2·(1, 1) + 0·(0, 1),
(4, 2) = 0·(1, 0) + 4·(1, 1) + (−2)·(0, 1),
(4, 2) = 4·(1, 0) + 0·(1, 1) + 2·(0, 1).

In fact, not only can Q² be spanned by S, but it can also be spanned by any subset of S that
contains two elements. Thus, S is a spanning set for Q², but it is not the smallest possible
spanning set for Q². It turns out that given a vector space V over a field F, we will be interested
in sets that not only span V but have the additional property that no smaller subset spans V. To
help us better understand this idea, we have


Lemma 14.6. If V is a vector space over a field F and S = {v₁, v₂, ..., vₙ} is a finite subset
of V, then the following statements are equivalent:


(a) No element of S can be written as a linear combination of the other elements of S.


(b) For every w ∈ span(S), there is only one way to express w as a linear combination of the
elements of S.


(c) The only way to express 0⃗ as a linear combination of the elements of S is
0⃗ = 0·v₁ + 0·v₂ + ··· + 0·vₙ.


Proof. Let us first suppose that statement (a) is true; we will prove that statement (b) also
holds. If w ∈ span(S), then there exist αᵢ ∈ F such that

w = α₁·v₁ + α₂·v₂ + ··· + αₙ·vₙ.


To show that this is the only way to express w as a linear combination of the elements of S, we
must show that if there exist βᵢ ∈ F such that

w = β₁·v₁ + β₂·v₂ + ··· + βₙ·vₙ,

then αᵢ = βᵢ, for all i ≤ n. We will proceed with a proof by contradiction, so let us assume that
there is some subscript j such that αⱼ ≠ βⱼ. Using the fact that

α₁·v₁ + α₂·v₂ + ··· + αₙ·vₙ = β₁·v₁ + β₂·v₂ + ··· + βₙ·vₙ,

we can subtract various elements from both sides of this equation so that the vⱼ term will be on
the left side of the equation and all the other vᵢ's will be on the right side. After doing this, we
obtain

(9) (αⱼ − βⱼ)·vⱼ = (β₁ − α₁)·v₁ + ··· + (βⱼ₋₁ − αⱼ₋₁)·vⱼ₋₁
+ (βⱼ₊₁ − αⱼ₊₁)·vⱼ₊₁ + ··· + (βₙ − αₙ)·vₙ.


Since αⱼ ≠ βⱼ, the element αⱼ − βⱼ has a multiplicative inverse in F. If we multiply both sides
of (9) by (αⱼ − βⱼ)⁻¹ and let γᵢ = (αⱼ − βⱼ)⁻¹(βᵢ − αᵢ), for i ≤ n, we obtain

vⱼ = γ₁·v₁ + ··· + γⱼ₋₁·vⱼ₋₁ + γⱼ₊₁·vⱼ₊₁ + ··· + γₙ·vₙ.


But this is a contradiction, as we have written vⱼ as a linear combination of the other elements
of S. Therefore, it must be the case that αᵢ = βᵢ, for all i ≤ n, and there is indeed only one way
to express w as a linear combination of the elements of S.


If statement (b) is true, it is quite easy to show that statement (c) is also true. Lemma 14.2(a)
told us that 0·v = 0⃗, for all v ∈ V. Therefore, it immediately follows that

0⃗ = 0·v₁ + 0·v₂ + ··· + 0·vₙ.


As a result, we have certainly illustrated one way to express 0⃗ as a linear combination of the
elements of S. However, statement (b) asserts that this must be the only way to express 0⃗
as a linear combination of the elements of S.


Finally, let us suppose statement (c) is true, and we will prove that statement (a) is true. By
way of contradiction, let us suppose that there is some vⱼ ∈ S that can be written as a linear
combination of the others. Therefore, there exist αᵢ ∈ F such that

vⱼ = α₁·v₁ + ··· + αⱼ₋₁·vⱼ₋₁ + αⱼ₊₁·vⱼ₊₁ + ··· + αₙ·vₙ.


If we subtract vⱼ from both sides and apply Lemma 14.2(c), we obtain

0⃗ = α₁·v₁ + ··· + αⱼ₋₁·vⱼ₋₁ + (−1)·vⱼ + αⱼ₊₁·vⱼ₊₁ + ··· + αₙ·vₙ.


However, this equation is a second way to express 0⃗ as a linear combination of the elements of
S. This contradicts statement (c), thereby concluding the proof.


We now give a name to those sets which satisfy the conditions in the previous lemma. 


Definition 14.7. Let V be a vector space over a field F. We say that a set of vectors
{v₁, v₂, ..., vₙ} is linearly independent if it satisfies any of the conditions in Lemma 14.6.
If the set {v₁, v₂, ..., vₙ} is not linearly independent, we say that it is linearly dependent.


To convince yourself that you understand Definition 14.7, you should try to prove that if a set 
S is linearly independent, then every nonempty subset of S is also linearly independent. 


If you are given a set of vectors, it is often quite tedious to determine whether or not they are 
linearly independent. After looking at the set for a while, you might get a sense of whether one
of the vectors can be written as a linear combination of the others. If you succeed in showing 
that one of the vectors can be written this way, then you have shown that they are linearly 
dependent. On the other hand, you may feel that there is only one way to express 0 as a linear 
combination of the vectors in the set. If you show this, then you will have shown that they are 
linearly independent. Regardless of whether the vectors ultimately turn out to be linearly 
independent or linearly dependent, the computation involved once again indicates the link 
between vector spaces and linear equations. 


Examples


Q[x] is a vector space over Q; we will look at various subsets of Q[x] and will determine if
they are linearly independent.


1. Let S₁ = {4x² + x, 3x + 1, 5}; we want to check if there exists a second linear
combination of the elements of S₁ which is equal to 0. To this end, suppose
α, β, γ ∈ Q such that

0 = α·(4x² + x) + β·(3x + 1) + γ·(5).


The right side of the previous equation is equal to 4αx² + (α + 3β)x + (β + 5γ). Recall
that the only way a polynomial can be equal to the zero polynomial is for all its
coefficients to be equal to 0. Hence, α, β, γ must simultaneously be solutions of

4α = 0, α + 3β = 0, β + 5γ = 0.


It is not hard to see that the only solution to these equations is α = β = γ = 0. Thus,
there is only one way to express 0 as a linear combination of the vectors in S₁, so
these vectors are indeed linearly independent.


2. Let S₂ = {x² + 1, 4, 2x²}; if you noticed that

x² + 1 = (1/2)·(2x²) + (1/4)·(4),


then it is clear that the vectors in S₂ are linearly dependent. If you did not notice this,
then you could check if there is a second way to express 0 as a linear combination of
the elements of S₂. Therefore, let us now suppose that α, β, γ ∈ Q such that

0 = α·(x² + 1) + β·(4) + γ·(2x²).


The polynomial on the right side is equal to (α + 2γ)x² + (α + 4β). As a result, α, β, γ
must be solutions of

α + 2γ = 0, α + 4β = 0.


There are many solutions to these equations; in particular, α = −4, β = 1, γ = 2 is
one of them. Thus, the vectors in S₂ are linearly dependent.


3. Let S₃ = {x, x³, x⁵}; if a linear combination of these vectors was equal to 0, then there
exist α, β, γ ∈ Q such that

0 = α·x + β·x³ + γ·x⁵.


However, this immediately implies that α = β = γ = 0. Therefore, there is only one
way to express 0 as a linear combination of elements of S₃, so the elements in S₃ are
linearly independent.


4. Let S₄ = {x + 1, x² + 1, x³ + 1}; to check if these vectors are linearly independent,
suppose there exist α, β, γ ∈ Q such that

0 = α·(x + 1) + β·(x² + 1) + γ·(x³ + 1).

As a result,

γx³ + βx² + αx + (α + β + γ) = 0.


This immediately implies that α = β = γ = 0. Thus, the elements in S₄ are linearly
independent.
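
Checks of this kind can be automated by recording each polynomial as its list of coefficients and running elimination over Q. The sketch below is a minimal illustration, not a method from the text: it relies on the standard fact, assumed here without proof, that a finite list of vectors is linearly independent exactly when Gaussian elimination leaves as many nonzero rows as there are vectors. The names `rank` and `is_independent` are hypothetical.

```python
from fractions import Fraction as Fr

def rank(vectors):
    """Number of nonzero rows left after Gaussian elimination over Q."""
    m = [[Fr(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(m[0])):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        # Clear column c in the rows below the pivot row.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def is_independent(vectors):
    """True when the only combination giving the zero vector is the trivial one."""
    return rank(vectors) == len(vectors)

# Polynomials are stored as (constant, x coefficient, x^2 coefficient).
first_set = [(0, 1, 4), (1, 3, 0), (5, 0, 0)]    # 4x^2 + x, 3x + 1, 5
second_set = [(1, 0, 1), (4, 0, 0), (0, 0, 2)]   # x^2 + 1, 4, 2x^2

print(is_independent(first_set))
print(is_independent(second_set))
```

The first set corresponds to example 1 and the second to example 2, so the two calls should agree with the conclusions reached by hand.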


To obtain more experience with linearly independent sets and spanning sets, we will look at 
various subsets of R?. 


Examples—Linearly Independent and Spanning Sets in R²


The set R² is a vector space over R. We will now list various subsets of R² that are linearly
independent and some that are spanning sets and will then make some observations.


Linearly Independent Sets Spanning Sets 


{(0, 1), 1, 0)} {(, 1), (1, 0)} 

{(1, 2)} {(, 1), Cd, 1), C, 0)} 

{(3,4), 6, -D} {(2,4), 3, ),6,-D} 

{(1, 2), (=2, O)} {C1, 2), (=2, 0)} 


{(0, 1)} (5, 143, 1),5,0),.(7.3)t 
{(1, l} {(4,5), (6, 7), (8, 5)} 
{(—2, 6), (—2, 5)} {(7, 4), (—8, 11} 


We can now make the following observations: 
1. The number of elements in each of the linearly independent sets is less than or equal 
to the number of elements in each of the spanning sets. 


2. The sets that are on both lists contain exactly two elements. 


When we intuitively think of the relative size of R² compared to R, we think of R² as being
two times as large as R. At the same time, every subset of R² in our example that is both
linearly independent and spans R² has two elements. This is no coincidence.


Similarly, if you look at various subsets of the vector space Q³ over Q, then you will observe
that every linearly independent subset has at most 3 elements and every spanning set has at
least 3 elements. Therefore, those subsets of Q³ that are both linearly independent and span Q³
have exactly 3 elements. Again note that the number 3 agrees with our intuitive notion of what
the relative size of Q³ is compared to Q.
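
These observations can be explored by computer for sets of pairs with rational entries. The sketch below is an illustration under stated assumptions: it uses Gaussian elimination over Q, together with the standard facts (not yet established in the text) that a list of vectors is linearly independent when the elimination leaves as many nonzero rows as vectors, and spans the plane when it leaves two nonzero rows. The names `rank`, `is_independent`, and `spans_plane` are hypothetical.

```python
from fractions import Fraction as Fr

def rank(vectors):
    """Number of nonzero rows left after Gaussian elimination over Q."""
    m = [[Fr(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def is_independent(vectors):
    """Assumed criterion: no row is eliminated."""
    return rank(vectors) == len(vectors)

def spans_plane(vectors):
    """Assumed criterion: two nonzero rows remain, so every pair is reachable."""
    return rank(vectors) == 2

examples = [
    [(1, 1)],
    [(1, 2), (-2, 0)],
    [(-2, 6), (-2, 5)],
    [(7, 4), (-8, 11)],
    [(4, 5), (6, 7), (8, 5)],
]
for s in examples:
    print(s, len(s), is_independent(s), spans_plane(s))
```

In this small sample, every set that passes both tests has exactly two elements, which matches the second observation above.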


14.3 Basis and Dimension 


Having looked at quite a few examples, we can now take a more general view of things.
Suppose V is a vector space over a field F, and let S = {v₁, v₂, ..., vₙ} be a subset of V, which
is both linearly independent and spans V. If (α₁, α₂, ..., αₙ) is an n-tuple of elements of F,
then we can associate to this n-tuple the element

w = α₁·v₁ + α₂·v₂ + ··· + αₙ·vₙ ∈ V.


Since S spans V, every element of V can be obtained in this way from S and an n-tuple of
elements of F. Furthermore, since S is linearly independent, Lemma 14.6(b) asserts that
different n-tuples of elements of F always produce different elements of V. As a result, using
S, there is now a one-to-one correspondence between the elements of V and n-tuples of
elements of F. Having now seen that every element of V can be expressed uniquely using n
elements of F, it makes sense to say that the relative size of V compared to F is equal to the
number of elements in S. This appears to have successfully formalized the idea of relative
size. However, one problem remains. If we believe that the relative size of V compared to F
should be the number of elements in a set that is both linearly independent and spans V,
we need to deal with the fact that V can have many different subsets which are both linearly
independent and span V. Therefore, for the idea of relative size to truly make sense, we must
deal with the following.


Question: If a vector space V has various subsets that are simultaneously linearly independent 
and span V, do all these subsets have the same size? 


This is a fundamental question, and its answer will appear in Corollary 14.10. In order to
prove Corollary 14.10, we first need to prove Theorem 14.8. It asserts that, in a vector space,
the number of elements in a spanning set cannot be smaller than the number of elements
in a linearly independent set. Once we prove Theorem 14.8, it will be easy to prove
Corollary 14.10. The combination of Theorem 14.8 and Corollary 14.10 is the foundation
behind much of linear algebra. Not surprisingly, they are the main results of this section, and
they will allow us to completely formalize and understand the concept of dimension.


Theorem 14.8. Let V be a vector space over a field F. If {v₁, v₂, ..., vₛ} spans V and if
{w₁, w₂, ..., wᵣ} is a linearly independent subset of V, then s ≥ r.


Once we prove Theorem 14.8, it will be easy to prove that any two subsets of V that are both 
linearly independent and spanning must have the same size. However, to prove Theorem 14.8, 
we first need 


Lemma 14.9. Let V be a vector space over F and let e₁, e₂, ..., eₙ, e, f₁, f₂, ..., fₘ be
elements of V such that


(a) the set {e₁, e₂, ..., eₙ, e} is linearly independent and


(b) the set {e₁, e₂, ..., eₙ, f₁, f₂, ..., fₘ} spans V.


Then there exists some k ≤ m such that the set

T = {e₁, e₂, ..., eₙ, e, f₁, f₂, ..., fₖ₋₁, fₖ₊₁, ..., fₘ}

also spans V. In other words, the element e can replace one of the fᵢ's in the set in (b) and the
new set will still span V.


Proof. Since the set {e₁, e₂, ..., eₙ, f₁, f₂, ..., fₘ} spans V, we know that there exist
αᵢ, βᵢ ∈ F such that

(10) e = α₁·e₁ + α₂·e₂ + ··· + αₙ·eₙ + β₁·f₁ + β₂·f₂ + ··· + βₘ·fₘ.

If all the βᵢ's in (10) were 0, equation (10) would reduce to

e = α₁·e₁ + α₂·e₂ + ··· + αₙ·eₙ.


However, this contradicts the fact that the set {e₁, e₂, ..., eₙ, e} is linearly independent.
Therefore, there exists some subscript k ≤ m such that βₖ ≠ 0. If we subtract e and βₖ·fₖ from
both sides of (10) and apply Lemma 14.2(c), we obtain

(−βₖ)·fₖ = α₁·e₁ + α₂·e₂ + ··· + αₙ·eₙ + (−1)·e +
β₁·f₁ + β₂·f₂ + ··· + βₖ₋₁·fₖ₋₁ + βₖ₊₁·fₖ₊₁ + ··· + βₘ·fₘ.


Let T = {e₁, e₂, ..., eₙ, e, f₁, f₂, ..., fₖ₋₁, fₖ₊₁, ..., fₘ}; we would like to show that
fₖ ∈ span(T). But when we multiply both sides of our previous equation by (−βₖ)⁻¹, it
becomes clear that fₖ belongs to span(T). Therefore, span(T) contains every element of the
original spanning set and Lemma 14.5 now asserts that T must also span V.
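
A small computation in Q³ illustrates the replacement step. The sketch below does not follow the constructive argument of the proof: it simply tries each candidate fₖ and tests whether the resulting set still spans, using Gaussian elimination over Q and the assumed criterion that a list of vectors spans Q³ exactly when three nonzero rows remain. The names `rank` and `exchange_index`, and the sample vectors, are hypothetical.

```python
from fractions import Fraction as Fr

def rank(vectors):
    """Number of nonzero rows left after Gaussian elimination over Q."""
    m = [[Fr(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def exchange_index(es, e, fs, dim):
    """Return the first index k such that es + [e] + fs without fs[k] spans Q^dim.

    Brute-force search; returns None if no single replacement works.
    """
    for k in range(len(fs)):
        candidate = es + [e] + fs[:k] + fs[k + 1:]
        if rank(candidate) == dim:
            return k
    return None

es = [(1, 0, 0)]                 # plays the role of e_1
e = (1, 1, 0)                    # the new element e
fs = [(0, 1, 0), (0, 0, 1)]      # play the roles of f_1, f_2

print(rank(es + [e]) == 2)       # hypothesis (a): {e_1, e} is independent
print(rank(es + fs) == 3)        # hypothesis (b): {e_1, f_1, f_2} spans
print(exchange_index(es, e, fs, 3))
```

Here e = 1·e₁ + 1·f₁ + 0·f₂, so the coefficient of f₁ is nonzero while that of f₂ is 0; this is consistent with the search finding that f₁ can be replaced, and removing f₂ instead leaves a set that no longer spans.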


Lemma 14.9 is the key to the proof of Theorem 14.8 as it allows us to replace, one element at a
time, the elements of the spanning set {v₁, v₂, ..., vₛ} by elements of the linearly independent
set {w₁, w₂, ..., wᵣ}.


Proof of Theorem 14.8. By way of contradiction, let us assume that s < r. Since {w₁} is a
linearly independent set and {v₁, v₂, ..., vₛ} is a spanning set, Lemma 14.9 asserts that we can
replace one of the vᵢ's by w₁ and still have a spanning set. By reordering the vᵢ's, we may
assume that we can replace v₁ by w₁. Therefore, we may now assume that {w₁, v₂, ..., vₛ} is a
spanning set.


But now, {w₁, w₂} is a linearly independent set and {w₁, v₂, ..., vₛ} is a spanning set. Again
applying Lemma 14.9, we can replace one of the remaining vᵢ's by w₂ and still have a
spanning set. By reordering the remaining vᵢ's, we may assume that we can replace v₂ by w₂.
Thus, {w₁, w₂, v₃, ..., vₛ} is a spanning set. Observe that we are now in the case where
{w₁, w₂, w₃} is a linearly independent set and {w₁, w₂, v₃, ..., vₛ} is a spanning set. Note
that we can again apply Lemma 14.9 and reorder the remaining vᵢ's so that {w₁, w₂, w₃,
v₄, ..., vₛ} is a spanning set.


Since we are assuming that s < r, we can continue to apply Lemma 14.9 and replace
elements of the spanning set {v₁, v₂, ..., vₛ} by elements of the linearly independent set
{w₁, w₂, ..., wᵣ} until we have exhausted all the vᵢ's. When we have reached the point that
we have replaced every vᵢ, we see that the set {w₁, w₂, ..., wₛ} is also a spanning set. But
since s < r, we can talk about the element wₛ₊₁. Since {w₁, w₂, ..., wₛ} is a spanning set,
wₛ₊₁ must be in the span of these other wᵢ's. However, by Lemma 14.6(a), no element of a
linearly independent set can be in the span of the others and this contradicts the fact that
{w₁, w₂, ..., wₛ, wₛ₊₁} is linearly independent. As a result, it is impossible for s to be less
than r and s ≥ r, as desired.


We can now easily prove 




Corollary 14.10. Let V be a vector space over a field F and let S and T be finite subsets of 
V, both of which are linearly independent and span V. Then S and T have the same number of 
elements. 


Proof. Let $n$ be the number of elements in $S$ and $m$ the number of elements in $T$. Since $S$ is
a spanning set and $T$ is linearly independent, Theorem 14.8 tells us that $n \ge m$. Changing
our perspective, it is also the case that $S$ is linearly independent and $T$ is a spanning set. Now
Theorem 14.8 tells us that $m \ge n$. Combining these two facts, we see that $n = m$, as
desired.


In light of Corollary 14.10, all subsets of a vector space V that are linearly independent and 
span V must have the same size. Therefore, it now makes perfect sense to consider the 
relative size of a vector space V compared to the field F to be the number of elements in a 
subset of V that is both linearly independent and spanning. Clearly, sets that are both linearly 
independent and spanning are quite important, and this motivates the following: 


Definition 14.11. Let $V$ be a vector space over a field $F$. If the set $S = \{v_1, v_2, \ldots, v_n\}$ is both
linearly independent and spans $V$, then we call $S$ a basis of $V$. We call the number of elements
in $S$ the dimension of $V$ over $F$.


Given a vector space $V$ over a field $F$, it does not have a unique basis. However,
Corollary 14.10 tells us that every basis of $V$ over $F$ has the same number of elements. Thus,
we have achieved our goal of showing that the concept of dimension is well defined. For
example, observe that in the vector space $\mathbb{R}^2$ over $\mathbb{R}$, each of the sets


$$\{(1,0),(0,1)\}, \quad \{(0,2),(3,1)\}, \quad \{(-1,3),(7,9)\}$$


is a basis. Corollary 14.10 tells us that every basis of $\mathbb{R}^2$ over $\mathbb{R}$ has 2 elements in it. Thus, the
dimension of $\mathbb{R}^2$ over $\mathbb{R}$ is 2.


Examples: Some Vector Spaces and Their Dimensions


In each of these examples, we will write down one basis for each vector space. However, 
keep in mind that for each of these examples, there are an infinite number of choices you 
could make for a basis. But always remember that every basis for a given vector space 
must have the same number of elements. 

1. If $F$ is a field, then $F^n$ is a vector space over $F$. The set

$$\{(1,0,0,\ldots,0),\ (0,1,0,\ldots,0),\ \ldots,\ (0,0,\ldots,0,1)\}$$

is a basis. This set has $n$ elements, thus $F^n$ has dimension $n$ over $F$. For example,
$\{(1,0), (0,1)\}$ is a basis for $\mathbb{R}^2$ over $\mathbb{R}$, hence $\mathbb{R}^2$ has dimension 2 over $\mathbb{R}$. Similarly,
$\{(1,0,0), (0,1,0), (0,0,1)\}$ is a basis for $\mathbb{Q}^3$ over $\mathbb{Q}$, so $\mathbb{Q}^3$ has dimension 3 over $\mathbb{Q}$.




2. $\mathbb{C}$ is a vector space over $\mathbb{R}$ and $\{1, i\}$ is a basis. Therefore, $\mathbb{C}$ has dimension 2 over $\mathbb{R}$.


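3. $\mathbb{Q}(\sqrt{3}) = \{\alpha + \beta\sqrt{3} \mid \alpha, \beta \in \mathbb{Q}\}$ is a vector space over $\mathbb{Q}$ and $\{1, \sqrt{3}\}$ is certainly a
spanning set. But to be a basis, it also needs to be linearly independent. To this end,
suppose there exist $\alpha, \beta \in \mathbb{Q}$ such that $\alpha \cdot 1 + \beta \cdot \sqrt{3} = 0$. If $\beta \ne 0$, then $\sqrt{3} = -\alpha/\beta \in \mathbb{Q}$,
which contradicts the fact that $\sqrt{3}$ is not rational. Therefore, $\beta = 0$, and it is now
easy to see that $\alpha = 0$. Thus, both $\alpha$ and $\beta$ must be 0, and the set $\{1, \sqrt{3}\}$ is indeed
linearly independent. Hence, $\mathbb{Q}(\sqrt{3})$ has dimension 2 over $\mathbb{Q}$.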


4. If $F$ is a field, let $P_n(x)$ denote all the polynomials in $F[x]$ of degree less than $n$.
Recall that we have adopted the convention that $P_n(x)$ contains the polynomial
$f(x) = 0$. Then $P_n(x)$ is a vector space over $F$, and it is not hard to check that the set
$\{1, x, x^2, \ldots, x^{n-1}\}$ is a basis. Thus, $P_n(x)$ has dimension $n$ over $F$.


5. Let $V$ be the polynomials in $P_n(x)$ with constant term 0. In this case, $\{x, x^2, \ldots, x^{n-1}\}$
is a basis for $V$ over $F$, so $V$ has dimension $n - 1$ over $F$.


6. If $n \ge 2$, let $W$ be the polynomials in $P_n(x)$ that have 1 as a root. Every element of $W$
must simultaneously have degree less than $n$ and also have $x - 1$ as a factor.
Therefore, if $f(x) \in W$, then $f(x) = q(x)(x - 1)$, where
$q(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_{n-2} x^{n-2}$. We can now rewrite the equation $f(x) = q(x)(x - 1)$ as

$$f(x) = \alpha_0 (x - 1) + \alpha_1 x(x - 1) + \cdots + \alpha_{n-2} x^{n-2}(x - 1).$$

Therefore, the set $\{(x-1),\ x(x-1),\ x^2(x-1),\ \ldots,\ x^{n-2}(x-1)\}$ spans $W$. Observe
that this set is also linearly independent, for if

$$\alpha_0 (x - 1) + \alpha_1 x(x - 1) + \cdots + \alpha_{n-2} x^{n-2}(x - 1) = 0,$$

then

$$(\alpha_0 + \alpha_1 x + \cdots + \alpha_{n-2} x^{n-2})(x - 1) = 0,$$

which implies that

$$\alpha_0 + \alpha_1 x + \cdots + \alpha_{n-2} x^{n-2} = 0.$$

Thus, each $\alpha_i = 0$, so $\{(x-1),\ x(x-1),\ x^2(x-1),\ \ldots,\ x^{n-2}(x-1)\}$ is also linearly
independent. Since this set has $n - 1$ elements, $W$ has dimension $n - 1$ over $F$.


7. Earlier in this chapter, we looked at the vector space

$$V = \{(a, b) \mid a, b \in \mathbb{Q} \text{ and } a + b = 0\}$$

and wondered if its dimension over $\mathbb{Q}$ should be 1 or 2. The set $\{(-1, 1)\}$ is a spanning
set for $V$ over $\mathbb{Q}$, and it also follows from Lemma 14.2(d) that any subset of a vector
space consisting of only one element is linearly independent provided the element is
not 0. Thus, the set $\{(-1, 1)\}$ is a basis for $V$ over $\mathbb{Q}$ and $V$ has dimension 1 over $\mathbb{Q}$.


8. Earlier in this chapter, we also examined the vector space

$$W = \{(a, b, c, d) \mid a, b, c, d \in \mathbb{Q} \text{ and } a + 2b + 5c - 3d = 0\}.$$

At the time, it was unclear if the dimension was 3, or 4, or some entirely different
number. You can now check that $\{(-2, 1, 0, 0),\ (-5, 0, 1, 0),\ (3, 0, 0, 1)\}$ is a basis for
$W$ over $\mathbb{Q}$. Thus, $W$ has dimension 3 over $\mathbb{Q}$.
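
The check suggested in Example 8 can also be carried out mechanically. The following is only a minimal sketch in Python and not part of the original text; the helper names `rank`, `in_W`, and `combine` are hypothetical and chosen for illustration. It confirms that the three vectors lie in $W$, that they are linearly independent (by row reduction with exact rational arithmetic), and that a sample element $(a, b, c, d)$ of $W$ is recovered as $b \cdot v_1 + c \cdot v_2 + d \cdot v_3$, which is the pattern behind the spanning claim.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors, via Gaussian elimination over Q."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        # look for a row at or below position r with a nonzero entry in column c
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

basis = [(-2, 1, 0, 0), (-5, 0, 1, 0), (3, 0, 0, 1)]

def in_W(v):
    """Membership test for W: a + 2b + 5c - 3d = 0."""
    a, b, c, d = v
    return a + 2 * b + 5 * c - 3 * d == 0

def combine(coeffs, vectors):
    """Linear combination of 4-tuples with the given scalar coefficients."""
    return tuple(sum(k * v[i] for k, v in zip(coeffs, vectors)) for i in range(4))

all_in_W = all(in_W(v) for v in basis)
independent = rank(basis) == len(basis)

# A sample element of W: choose b, c, d freely and set a = -2b - 5c + 3d.
b, c, d = 7, Fraction(1, 2), 4
w = (-2 * b - 5 * c + 3 * d, b, c, d)
spans_sample = in_W(w) and combine((b, c, d), basis) == w
```

One sample does not prove that the set spans $W$; the general identity $(a, b, c, d) = b \cdot v_1 + c \cdot v_2 + d \cdot v_3$ for $a = -2b - 5c + 3d$ is what does.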


Throughout this chapter, we have seen that linearly independent sets need not be spanning
sets, and spanning sets need not be linearly independent. However, we will now show that
if a vector space $V$ has dimension $n$, then a set with $n$ elements is linearly independent if and
only if it spans $V$. In light of this, if you already know a vector space has dimension $n$, then to
show that a set with $n$ elements is a basis, you only need to verify that the set is either
linearly independent or spans $V$. We record this as


Lemma 14.12. Let $V$ be a vector space of dimension $n$ over the field $F$. Then a set
$S = \{v_1, v_2, \ldots, v_n\}$ is linearly independent if and only if it spans $V$.


Proof. In one direction, let us suppose that $S$ is linearly independent. We need to show that
every $w \in V$ belongs to $\operatorname{span}(S)$. Certainly, if $w \in S$, then there is nothing to prove. On the
other hand, if $w \notin S$, then the set $S \cup \{w\}$ contains $n + 1$ elements. However, since every basis
of $V$ contains $n$ elements, $V$ certainly contains subsets of size $n$ that are spanning sets.
Therefore, Theorem 14.8 implies that the set $S \cup \{w\}$ is not linearly independent. As a result,
there exist $\alpha_i, \alpha \in F$, not all of which are 0, such that

$$\alpha_1 \cdot v_1 + \alpha_2 \cdot v_2 + \cdots + \alpha_n \cdot v_n + \alpha \cdot w = 0. \tag{11}$$

Observe that $\alpha$ cannot be 0, otherwise $S$ would be linearly dependent. Since $\alpha \ne 0$, we can
subtract $\alpha \cdot w$ from both sides of (11) and then multiply both sides by $-\alpha^{-1}$ to see that
$w \in \operatorname{span}(S)$. Thus, $S$ does span $V$.


In the other direction, let us suppose that $S$ spans $V$ and we will show that $S$ is linearly
independent using a proof by contradiction. If $S$ is not linearly independent, then Lemma 14.6
tells us that some $v_i \in S$ is in the span of the remaining elements. By reordering the elements
of $S$, we may assume that $v_1$ is in the span of the set $T = \{v_2, v_3, \ldots, v_n\}$. However, since
$v_1 \in \operatorname{span}(T)$, $T$ has the property that $\operatorname{span}(T)$ contains the spanning set $S$ and Lemma 14.5
asserts that $T$ also spans $V$. Every basis of $V$ is a linearly independent set with $n$ elements, yet
$T$ is a spanning set with only $n - 1$ elements. This contradicts Theorem 14.8, thus $S$ is linearly
independent.


In light of Lemma 14.12, if we already know that a vector space has dimension n, then it is 
often quite easy to determine whether a particular subset with n elements is a basis. 


Examples


1. Let $P_4(x)$ be the polynomials of degree less than 4 in $\mathbb{Q}[x]$ and let
$S = \{1,\ x + 1,\ x^2 + 1,\ x^3 + 1\}$. Then

$$x = (x + 1) - 1, \quad x^2 = (x^2 + 1) - 1, \quad x^3 = (x^3 + 1) - 1 \in \operatorname{span}(S).$$

Therefore, $\operatorname{span}(S)$ contains the spanning set $\{1, x, x^2, x^3\}$. By Lemma 14.5, $S$ spans
$P_4(x)$. However, $S$ has 4 elements and $P_4(x)$ has dimension 4 over $\mathbb{Q}$, so
Lemma 14.12 tells us that $S$ is a basis of $P_4(x)$.


2. Consider the subset $\{(2, 0), (-4, 3)\}$ of $\mathbb{R}^2$. If $\alpha, \beta \in \mathbb{R}$ have the property that
$\alpha \cdot (2, 0) + \beta \cdot (-4, 3) = (0, 0)$, then $\alpha, \beta$ must satisfy the equations

$$2\alpha - 4\beta = 0 \quad \text{and} \quad 3\beta = 0.$$

It is easy to see that $\alpha = \beta = 0$ is the only solution, so $\{(2, 0), (-4, 3)\}$ is linearly
independent. Since $\mathbb{R}^2$ has dimension 2 over $\mathbb{R}$ and our set has 2 elements,
Lemma 14.12 implies that $\{(2, 0), (-4, 3)\}$ is a basis of $\mathbb{R}^2$.
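
The spanning argument in Example 1 can be made fully explicit. The sketch below is an illustration only and is not taken from the text: it represents a polynomial in $P_4(x)$ by its tuple of coefficients (constant term first), and the function names `coordinates_in_S` and `combine` are hypothetical. It writes an arbitrary polynomial $a_0 + a_1 x + a_2 x^2 + a_3 x^3$ as $(a_0 - a_1 - a_2 - a_3) \cdot 1 + a_1 (x+1) + a_2 (x^2+1) + a_3 (x^3+1)$ and checks the identity on a sample.

```python
from fractions import Fraction

# Coefficient tuples (constant, x, x^2, x^3) for S = {1, x+1, x^2+1, x^3+1}.
S = [(1, 0, 0, 0), (1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)]

def coordinates_in_S(p):
    """Scalars c0..c3 with p = c0*1 + c1*(x+1) + c2*(x^2+1) + c3*(x^3+1)."""
    a0, a1, a2, a3 = p
    return (a0 - a1 - a2 - a3, a1, a2, a3)

def combine(coeffs, vectors):
    """Linear combination of coefficient tuples."""
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(4))

# Sample polynomial 7 - 2x + (1/3)x^2 + 5x^3 with rational coefficients.
p = (Fraction(7), Fraction(-2), Fraction(1, 3), Fraction(5))
rebuilt = combine(coordinates_in_S(p), S)
```

Since every polynomial of degree less than 4 can be rebuilt this way, $S$ spans $P_4(x)$, and Lemma 14.12 then gives linear independence without a separate computation.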


There certainly exist vector spaces $V$ whose dimension over $F$ is not finite. For example,
consider the vector space $\mathbb{Q}[x]$ over $\mathbb{Q}$. If we let $S = \{p_1(x), p_2(x), \ldots, p_n(x)\}$ be any finite
subset of $\mathbb{Q}[x]$, then we can let $m$ denote the largest exponent that occurs in any of the
polynomials in $S$. Since the power $x^{m+1}$ does not appear in any of the polynomials in $S$, it is
clear that $x^{m+1} \notin \operatorname{span}(S)$. Thus, $S$ does not span $\mathbb{Q}[x]$. This tells us that no finite subset of
$\mathbb{Q}[x]$ spans $\mathbb{Q}[x]$, so $\mathbb{Q}[x]$ cannot contain a finite subset that is both linearly independent and
spans. In cases like this, we say that our vector space $V$ is infinite dimensional over $F$. In the
next result, we will see that the existence of a finite subset that spans is not only necessary but
also sufficient for a vector space to have finite dimension.


Lemma 14.13. Let V be a vector space over a field F. Then the dimension of V over F is 
finite if and only if V contains a spanning set which is finite. 




Proof. Clearly if the dimension of $V$ is finite, then any basis of $V$ is also a finite spanning set.
In the other direction, suppose $S = \{v_1, v_2, \ldots, v_m\}$ is a finite set that spans $V$. The set $S$
certainly contains subsets that are linearly independent, so we can let $T$ be a subset of $S$ of the
largest possible size that is linearly independent. Our goal is to show that $T$ also spans $V$, so it
suffices to show that $\operatorname{span}(T)$ contains $S$. By reordering the elements of $S$, we may assume
that $T = \{v_1, v_2, \ldots, v_k\}$, where $k \le m$. If $T = S$, then certainly $\operatorname{span}(T)$ contains $S$. On the
other hand, if $T \ne S$, let $v \in S$ such that $v \notin T$. Observe that the set $T \cup \{v\}$ is a subset of $S$ that
is larger than $T$. By our choice of $T$, the set $T \cup \{v\}$ must be linearly dependent. As a result,
there exist $\alpha_i, \alpha \in F$, not all of which are 0, such that

$$\alpha_1 \cdot v_1 + \alpha_2 \cdot v_2 + \cdots + \alpha_k \cdot v_k + \alpha \cdot v = 0.$$

Since $T$ is linearly independent, it must be the case that $\alpha \ne 0$. Therefore, we can solve for $v$
and see that $v$ can be written as a linear combination of the elements of $T$. Thus, $v \in \operatorname{span}(T)$,
so $V = \operatorname{span}(S) = \operatorname{span}(T)$. Therefore, $T$ is a basis of $V$, so $V$ has dimension $k \le m$.


Observe that the proof of Lemma 14.13 really shows that if $T$ is a linearly independent subset
of a vector space such that any set that contains $T$ and is not equal to $T$ is linearly dependent,
then $T$ is actually a basis. This idea will appear again several times in the next section.
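
The idea behind Lemma 14.13 can be turned into a small procedure for extracting a basis from a finite spanning set. The sketch below is an assumption-laden illustration rather than the method of the proof: instead of choosing a linearly independent subset of largest possible size, it makes a single greedy pass and keeps each vector that is not in the span of those already kept. The result is a linearly independent subset $T$ of $S$ such that adding any other element of $S$ makes it dependent, so $\operatorname{span}(T)$ contains $S$ and the observation above applies. The names `rank` and `extract_basis` are hypothetical, and the vectors are taken in $\mathbb{Q}^4$.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors, via Gaussian elimination over Q."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def extract_basis(spanning_set):
    """Greedy pass: keep a vector only if it enlarges the span of those kept so far."""
    kept = []
    for v in spanning_set:
        if rank(kept + [v]) == len(kept) + 1:
            kept.append(v)
    return kept

# A spanning set (for its own span) with two redundant vectors.
S = [(1, 2, 0, 0), (2, 4, 0, 0), (0, 0, 1, 1), (1, 2, 1, 1), (0, 0, 0, 1)]
T = extract_basis(S)
```

Here the second vector is twice the first and the fourth is the sum of the first and third, so both are skipped, and the span of `S` has dimension 3.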


Exercises for Sections 14.2 and 14.3 


In exercises 1-12, you will be working in $\mathbb{R}^4$, which is a vector space over $\mathbb{R}$. In each exercise,
determine if the given set is

(a) a spanning set for $\mathbb{R}^4$,
(b) linearly independent over $\mathbb{R}$,
(c) both a spanning set and linearly independent, or
(d) neither a spanning set nor linearly independent.

Briefly explain your answer.

1. {(1,2,0,0), (3,0,1,0), (0,0,0,1)}

2. {(4,1,3,0), (0,0,1,1), (4,1,5,2)}

3. {(0,1,1,1), (,2,0,0), (2,0,-4,-4)}

4. {(0,1,1,1), (0,1,2,3), (0,0,1,1)}

5. {(0,0,0,0), (1,1,0,0), (1,1,1,0), (1,1,1,1)}

6. {(2,2,0,0), (-1,-1,-1,-1), (1,2,3,4), (0,0,1,1)}

7. {(4,1,0,0), (-12,-3,1,0), (0,0,1,0), (0,0,0,1)}

8. {(1,2,2,2), (0,1,1,1), (0,0,1,1), (0,0,2,1)}

9. {(3,0,1,1), G,0,2,2), (-1,0,1,-1), @,0,3,4), (4,0,1,1)}

10. {(1,0,0,0), (0,2,0,0), (0,0,4,4), (0,0,1,2), (0,0,-1,0)}

11. {(2,1,2,1), (0,1,0,1), (0,0,0,1), (1,1,1,1), (0,3,0,1)}

12. {(-1,0,1,2), (0,1,0,1), (2,1,0,0), (0,0,1,2), (0,0,1,-1)}


In exercises 13-24, you will be working in $P_3(x)$, the set of polynomials of degree less than 3
with coefficients in $\mathbb{Q}$. $P_3(x)$ is a vector space over $\mathbb{Q}$ and, in each exercise, determine if the
given set is

(a) a spanning set for $P_3(x)$,
(b) linearly independent over $\mathbb{Q}$,
(c) both a spanning set and linearly independent, or
(d) neither a spanning set nor linearly independent.

Briefly explain your answer.


13. $\{x + 1,\ 2x^2 - 5\}$

14. $\{-x + 14,\ 2x - 28\}$

15. $\{7x + 11,\ 7x - 11\}$

16. $\{5x - 35,\ -2x + 14\}$

17. {1, Xx, a}

18. $\{x + 1,\ x^2 + 1,\ x^2 - x\}$

19. {x+1, xX+2, x 243}

20. {25 3x 4, 2K — 65}

21. $\{x^2 + x,\ x^2 + x + 1,\ x^2 + x + 2,\ x^2 + x + 3\}$

22. $\{x^2 - 1,\ x^2 + 1,\ x + 1,\ x - 1\}$

23. $\{2x^2 + 1,\ 4x^2 + x + 2,\ 3x,\ 10x^2 - 21x + 5\}$

24. $\{x - 5,\ x^2 - 25,\ x^2 - 6x + 5,\ x^2 - 10x + 25\}$




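In exercises 25-30, you are given a subset of $\mathbb{R}^4$, which is also a vector space over $\mathbb{R}$. In each
exercise, find the dimension and a basis. Keep in mind that there are many choices for a basis.

25. $\{(a,b,c,d) \in \mathbb{R}^4 \mid b = d = 0\}$

26. $\{(a,b,c,d) \in \mathbb{R}^4 \mid a = b = c = d\}$

27. $\{(a,b,c,d) \in \mathbb{R}^4 \mid a + b = 0 \text{ and } c - d = 0\}$

28. $\{(a,b,c,d) \in \mathbb{R}^4 \mid b - 2c + 3d = 0\}$

29. $\{(a,b,c,d) \in \mathbb{R}^4 \mid 2a - 3b + c - 8d = 0\}$

30. $\{(a,b,c,d) \in \mathbb{R}^4 \mid a = 0,\ b = 2c, \text{ and } 2b - 5c + 8d = 0\}$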


In exercises 31-36, P4(x) will denote the polynomials of degree less than 4 with coefficients 
in Q. In each exercise, you will be given a subset of P4(x) that is also a vector space over Q 
and you will need to find the dimension and a basis. Keep in mind that there are many choices 
for a basis. 


31. { f(x) € Pa(x)| FO) = 0} 

32. { f(x) € Pa(x)| FS) = fU0)} 

33. { f(x) € Pa(x)| fC) = fQ)} 

34. { fx) € Pa@ I f(-1) = FO) = fl) = 0} 

35. { f(x) € Pa@)| fO) = FU) = f)} 

36. { f(x) € Pa(x)| fO) = fC), f(2) = fGB), and f(4) = fS)} 


In exercises 37-38, let $S = \{v_1, v_2, \ldots, v_n\} \subseteq V$, where $V$ is a vector space over a field $F$.

37. Show that $\operatorname{span}(S)$ is also a vector space over $F$.

38. Show that the dimension of $\operatorname{span}(S)$ over $F$ is at most $n$ and the dimension is exactly $n$ if
and only if $S$ is linearly independent.


In Section 14.1, we indicated that if $F \subseteq K$ are fields then $K$ is always a vector space over $F$.
In exercises 39-46, we examine the more general situation where $F \subseteq R$, $F$ is a field, and $R$ is
a commutative ring.

39. Let $R = \{(a, b) \mid a, b \in \mathbb{Q}\}$ and let $F = \{(a, 0) \mid a \in \mathbb{Q}\}$. Observe that $R$ is a commutative
ring, where both addition and multiplication are done componentwise, and $F$ is a field.
Since $R$ is a commutative ring, the first three properties of Definition 14.1 are satisfied.
Prove that $R$ is not a vector space over $F$ by showing that property (4) of Definition 14.1
does not hold.


40. Let $R = \{(a, b) \mid a, b \in \mathbb{Q}\}$ and let $K = \{(a, a) \mid a \in \mathbb{Q}\}$. Similar to the situation in
exercise 39, $R$ is a commutative ring, $K$ is a field, and the first three properties of
Definition 14.1 are automatically satisfied. Prove that $R$ is a vector space over $K$ by
showing that property (4) of Definition 14.1 does hold.

41. Let $F \subseteq R$, where $F$ is a field and $R$ is a commutative ring. If we let $e$ denote the
multiplicative identity of $F$, show that $R$ is a vector space over $F$ if and only if $e$ is also
the multiplicative identity of $R$. Then examine how this result applies to the examples in
exercises 39 and 40.

42. Suppose $F \subseteq R$, where $F$ is a field and $R$ is an integral domain. Show that $R$ is a vector
space over $F$.

43. Let $F \subseteq R$, where $F$ is a field, $R$ is a commutative ring, and $R$ is a vector space over $F$.
Suppose $r \in R$ and $p(t) = t^n + \alpha_{n-1} t^{n-1} + \cdots + \alpha_1 t + \alpha_0 \in F[t]$ have the properties that
$p(r) = 0$ and $p(t)$ is the monic polynomial of smallest degree in $F[t]$ which has $r$ as a
root. (Note that when plugging $r$ into $p(t)$ or any other element of $F[t]$, the computations
are done within the ring $R$.)

(a) Show that if $\alpha_0 = 0$, then $r$ is a zero divisor in $R$.

(b) Show that if $\alpha_0 \ne 0$, then $r$ has a multiplicative inverse in $R$.

44. Let $F \subseteq R$, where $F$ is a field, $R$ is a commutative ring, and $R$ is a vector space of
dimension $n$ over $F$. Show that for every $r \in R$, there exists a monic $p(t) \in F[t]$ of
degree at most $n$ such that $p(r) = 0$.

45. In Chapter 7, we showed that if a ring is finite, then every nonzero element is either
invertible or a zero divisor. We also saw that this does not hold for rings in general. Prove
that if $F \subseteq R$, where $F$ is a field, $R$ is a commutative ring, and $R$ is a finite dimensional
vector space over $F$, then every nonzero element of $R$ is either invertible or a zero divisor.

46. Suppose $F \subseteq R$, where $F$ is a field and $R$ is an integral domain. In light of exercise 42,
we know that $R$ is a vector space over $F$. Show that if $R$ is finite dimensional over $F$,
then $R$ is a field.

47. Let $F \subseteq K$ be fields. Suppose $r \in K$ such that $r$ is the root of some nonzero polynomial in
$F[t]$ and let $n$ denote the degree of the minimum polynomial for $r$ over $F$. Show that the
set $\{1, r, r^2, \ldots, r^{n-1}\}$ is linearly independent over $F$.

48. Use exercise 47 to show that the set $\{1, 2^{1/7}, 2^{2/7}, 2^{3/7}, 2^{4/7}, 2^{5/7}, 2^{6/7}\}$ is a subset of $\mathbb{R}$ which is
linearly independent over $\mathbb{Q}$.

49. Let $r = \sqrt{2} + \sqrt{3}$.

(a) Show that the minimum polynomial for $r$ over $\mathbb{Q}$ has degree 4.


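(b) Compute $r^2$ and $r^3$ and then consider the sets $S = \{1, r, r^2, r^3\}$ and
$T = \{1, \sqrt{2}, \sqrt{3}, \sqrt{6}\}$, both of which are subsets of $\mathbb{R}$. Show that $S$ and $T$ have the same span
over $\mathbb{Q}$.

(c) Use exercises 37 and 38 to show that if $\operatorname{span}(T)$ represents the span of $T$ over $\mathbb{Q}$,
then $\operatorname{span}(T)$ has dimension 4 over $\mathbb{Q}$.

50. If $\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$, use exercise 49 to show that $\sqrt{3} \notin \mathbb{Q}(\sqrt{2})$ and
$\sqrt{6} \notin \mathbb{Q}(\sqrt{2})$.

51. If $\mathbb{Q}(\sqrt{3}) = \{a + b\sqrt{3} \mid a, b \in \mathbb{Q}\}$, use exercise 49 to show that $\sqrt{2} \notin \mathbb{Q}(\sqrt{3})$ and
$\sqrt{6} \notin \mathbb{Q}(\sqrt{3})$.

52. If $\mathbb{Q}(\sqrt{6}) = \{a + b\sqrt{6} \mid a, b \in \mathbb{Q}\}$, use exercise 49 to show that $\sqrt{2} \notin \mathbb{Q}(\sqrt{6})$ and
$\sqrt{3} \notin \mathbb{Q}(\sqrt{6})$.

53. Let $s = \sqrt{5} + \sqrt{7}$.

(a) Show that the minimum polynomial for $s$ over $\mathbb{Q}$ has degree 4.

(b) Compute $s^2$ and $s^3$ and then consider the sets $S = \{1, s, s^2, s^3\}$ and
$T = \{1, \sqrt{5}, \sqrt{7}, \sqrt{35}\}$, both of which are subsets of $\mathbb{R}$. Show that $S$ and $T$ have the
same span over $\mathbb{Q}$.

(c) Use exercises 37 and 38 to show that if $\operatorname{span}(T)$ represents the span of $T$ over $\mathbb{Q}$,
then $\operatorname{span}(T)$ has dimension 4 over $\mathbb{Q}$.

54. If $\mathbb{Q}(\sqrt{5}) = \{a + b\sqrt{5} \mid a, b \in \mathbb{Q}\}$, use exercise 53 to show that $\sqrt{7} \notin \mathbb{Q}(\sqrt{5})$ and
$\sqrt{35} \notin \mathbb{Q}(\sqrt{5})$.

55. If $\mathbb{Q}(\sqrt{7}) = \{a + b\sqrt{7} \mid a, b \in \mathbb{Q}\}$, use exercise 53 to show that $\sqrt{5} \notin \mathbb{Q}(\sqrt{7})$ and
$\sqrt{35} \notin \mathbb{Q}(\sqrt{7})$.

56. If $\mathbb{Q}(\sqrt{35}) = \{a + b\sqrt{35} \mid a, b \in \mathbb{Q}\}$, use exercise 53 to show that $\sqrt{5} \notin \mathbb{Q}(\sqrt{35})$ and
$\sqrt{7} \notin \mathbb{Q}(\sqrt{35})$.

57. Show that the set $S = \{2^{i/12} \mid 0 \le i \le 11\}$, which is a subset of $\mathbb{R}$, is linearly independent
over $\mathbb{Q}$.

58. Let $K$ be a field such that $\mathbb{Q} \subseteq K \subseteq \mathbb{R}$ and $K$ is finite dimensional over $\mathbb{Q}$. Use
exercise 57 to show that if $K$ contains both $2^{1/3}$ and $2^{1/4}$, then the dimension of $K$ over $\mathbb{Q}$
must be at least 12.

59. Show that the set $S = \{3^{i/35} \mid 0 \le i \le 34\}$, which is a subset of $\mathbb{R}$, is linearly independent
over $\mathbb{Q}$.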



60. Let $L$ be a field such that $\mathbb{Q} \subseteq L \subseteq \mathbb{R}$ and $L$ is finite dimensional over $\mathbb{Q}$. Use exercise 59
to show that if $L$ contains both $3^{1/5}$ and $3^{1/7}$, then the dimension of $L$ over $\mathbb{Q}$ must be at
least 35.

61. Let $V$ be a vector space of dimension $n$ over the finite field $F$, where $F$ has $q$ elements.
How many elements does $V$ have?


14.4 Subspaces and Linear Equations 


We can now begin to look more closely at the connection between vector spaces and linear 
equations. 


Definition 14.14. If $V$ is a vector space over a field $F$, then a subset $U$ of $V$ is called a
subspace if

(a) $u_1 + u_2 \in U$, for all $u_1, u_2 \in U$ and
(b) $\alpha \cdot u \in U$, for all $\alpha \in F$ and $u \in U$.


Observe that if $U$ is a subspace of $V$, then property (a) from Definition 14.14 tells us that we
can always add elements of $U$ and remain in $U$. Furthermore, property (b) from
Definition 14.14 tells us that

$$0 \cdot u = 0, \quad (-1) \cdot u = -u \in U,$$

for all $u \in U$. Therefore, $U$ contains the additive identity of $V$ and also contains the additive
inverse of each of its own elements. Since addition in $U$ inherits being associative and
commutative from $V$, $U$ is also a commutative group under addition. Furthermore, property
(b) from Definition 14.14 tells us that we can always multiply an element of $U$ by a scalar and
remain in $U$. Scalar multiplication in $U$ inherits properties 1-4 of Definition 14.1 from $V$, so $U$
now satisfies all the properties of a vector space. As a result, any subspace of $V$ is also a vector
space over $F$.


Examples

1. If $V$ is a vector space over $F$, then the sets $\{0\}$ and $V$ are both subspaces of $V$. Since
$\{0\}$ does not contain any nonzero vectors, we consider it to have dimension zero over $F$.

2. If $S$ is any finite subset of $V$, then $\operatorname{span}(S)$ is a subspace of $V$.

3. If we let $P_n(x)$ be the polynomials of degree less than $n$ in $\mathbb{R}[x]$, then we have

$$P_1(x) \subseteq P_2(x) \subseteq P_3(x) \subseteq P_4(x) \subseteq \cdots$$

Each $P_n(x)$ is a subspace of $\mathbb{R}[x]$. In fact, if $m < n$ then $P_m(x)$ is a subspace
of $P_n(x)$.


One would suspect that the dimension of a subspace cannot exceed the dimension of the 
original vector space. In fact, we have 


Proposition 14.15. Let $V$ be a vector space over a field $F$.

(a) If $V$ has dimension $n$ over $F$ and if $U$ is a subspace of $V$, then the dimension of $U$ is at
most $n$. Furthermore, $U = V$ if and only if $U$ also has dimension $n$ over $F$.

(b) If $U_1, U_2, \ldots, U_m$ are subspaces of $V$, then the intersection $U_1 \cap U_2 \cap \cdots \cap U_m$ is also a
subspace of $V$.


Proof. For part (a), since V has dimension n, any linearly independent subset of U also has at 
most n elements. Since U certainly contains subsets which are linearly independent, we can let 
S be a linearly independent subset of U which is as large as possible. The argument used in the 
proof of Lemma 14.13 shows that S is a basis of U. Since S can have at most n elements, the 
dimension of U is at most n. 


For the second piece of part (a), if U = V, then clearly U also has dimension n. On the other 
hand, if U has dimension n, then any basis S of U is a linearly independent subset of V with 
n elements. By Lemma 14.12, S spans V, so U = span(S ) = V, as desired. 


For part (b), since each $U_i$ is a subspace, it follows that if $\alpha \in F$ and
$u_1, u_2 \in U_1 \cap U_2 \cap \cdots \cap U_m$,
then $u_1 + u_2$ and $\alpha \cdot u_1$ belong to each of the $U_i$'s. Thus,
$u_1 + u_2,\ \alpha \cdot u_1 \in U_1 \cap U_2 \cap \cdots \cap U_m$
and we see that $U_1 \cap U_2 \cap \cdots \cap U_m$ satisfies the two conditions of Definition 14.14. As a result,
$U_1 \cap U_2 \cap \cdots \cap U_m$ is also a subspace.


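In the next lemma, if $(\alpha_1, \alpha_2, \ldots, \alpha_n) \in F^n$, we define the function $T : F^n \to F$ as

$$T(x) = \alpha_1 \cdot x_1 + \alpha_2 \cdot x_2 + \cdots + \alpha_n \cdot x_n,$$

for all $x = (x_1, x_2, \ldots, x_n) \in F^n$. For example, using the element $(3, -2) \in \mathbb{R}^2$, we can define
the function $T : \mathbb{R}^2 \to \mathbb{R}$ as $T((x_1, x_2)) = 3x_1 - 2x_2$, for all $(x_1, x_2) \in \mathbb{R}^2$.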




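Lemma 14.16. If $(\alpha_1, \alpha_2, \ldots, \alpha_n) \in F^n$, then

$$T(x + y) = T(x) + T(y) \quad \text{and} \quad T(\alpha \cdot x) = \alpha \cdot T(x),$$

for all $x = (x_1, x_2, \ldots, x_n)$, $y = (y_1, y_2, \ldots, y_n) \in F^n$ and $\alpha \in F$. Furthermore, the set
$U = \{x \in F^n \mid T(x) = 0\}$ is a subspace of $F^n$.

Proof. Observe that

$$\begin{aligned}
T(x + y) &= T((x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)) \\
&= \alpha_1 (x_1 + y_1) + \alpha_2 (x_2 + y_2) + \cdots + \alpha_n (x_n + y_n) \\
&= (\alpha_1 \cdot x_1 + \alpha_2 \cdot x_2 + \cdots + \alpha_n \cdot x_n) + (\alpha_1 \cdot y_1 + \alpha_2 \cdot y_2 + \cdots + \alpha_n \cdot y_n) = T(x) + T(y).
\end{aligned}$$

Similarly, we also have

$$\begin{aligned}
T(\alpha \cdot x) &= T((\alpha x_1, \alpha x_2, \ldots, \alpha x_n)) = \alpha_1 (\alpha x_1) + \alpha_2 (\alpha x_2) + \cdots + \alpha_n (\alpha x_n) \\
&= \alpha (\alpha_1 \cdot x_1 + \alpha_2 \cdot x_2 + \cdots + \alpha_n \cdot x_n) = \alpha \cdot T(x).
\end{aligned}$$

To see that $U$ is a subspace, if $x, y \in U$ and $\alpha \in F$, it follows that

$$T(x + y) = T(x) + T(y) = 0 + 0 = 0$$

and

$$T(\alpha \cdot x) = \alpha \cdot T(x) = \alpha \cdot 0 = 0.$$

Thus, $x + y,\ \alpha \cdot x \in U$ and $U$ is indeed a subspace of $F^n$.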

Examples

1. Let $T : \mathbb{R}^2 \to \mathbb{R}$ be the function $T((x_1, x_2)) = 4x_1 + x_2$, for all $(x_1, x_2) \in \mathbb{R}^2$. Then
$U = \{(x_1, x_2) \in \mathbb{R}^2 \mid 4x_1 + x_2 = 0\}$ is a subspace of $\mathbb{R}^2$. Since $U \ne \mathbb{R}^2$, the dimension of
$U$ over $\mathbb{R}$ is less than 2. On the other hand, since $(1, -4) \in U$, the dimension of $U$
over $\mathbb{R}$ is at least 1. Combining these two facts, $U$ has dimension 1 over $\mathbb{R}$ and the set
$\{(1, -4)\}$ is a basis.

2. Let $T : \mathbb{R}^3 \to \mathbb{R}$ be the function $T((x_1, x_2, x_3)) = x_1 + 2x_2 - 4x_3$, for all $(x_1, x_2, x_3) \in \mathbb{R}^3$.
Then $U = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1 + 2x_2 - 4x_3 = 0\}$ is a subspace of $\mathbb{R}^3$. Since
$U \ne \mathbb{R}^3$, the dimension of $U$ over $\mathbb{R}$ is less than 3. On the other hand, the set
$\{(4, 0, 1), (-2, 1, 0)\}$ is a linearly independent subset of $U$. Thus, the dimension of $U$
over $\mathbb{R}$ is at least 2. Combining these facts, $U$ has dimension 2 over $\mathbb{R}$ and
$\{(4, 0, 1), (-2, 1, 0)\}$ is a basis.

We can begin applying the theory developed in this chapter to linear equations. 
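
Corollary 14.17. Let $U$ be the set of $n$-tuples $x = (x_1, x_2, \ldots, x_n) \in F^n$ such that

$$\begin{aligned}
\alpha_{11} \cdot x_1 + \alpha_{12} \cdot x_2 + \cdots + \alpha_{1n} \cdot x_n &= 0, \\
\alpha_{21} \cdot x_1 + \alpha_{22} \cdot x_2 + \cdots + \alpha_{2n} \cdot x_n &= 0, \\
&\ \,\vdots \\
\alpha_{m1} \cdot x_1 + \alpha_{m2} \cdot x_2 + \cdots + \alpha_{mn} \cdot x_n &= 0,
\end{aligned}$$

where $\alpha_{ij} \in F$ and $1 \le i \le m$, $1 \le j \le n$. Then $U$ is a subspace of $F^n$. Furthermore, if the field
$F$ is infinite, then $U$ is either an infinite set or consists only of the single element
$0 = (0, 0, \ldots, 0)$.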


Proof. For every $i \le m$, let $T_i : F^n \to F$ be the function

$$T_i(x) = \alpha_{i1} x_1 + \alpha_{i2} x_2 + \cdots + \alpha_{in} x_n,$$

for all $x \in F^n$. In addition, let

$$U_i = \{x \in F^n \mid T_i(x) = 0\}.$$

Then, by Lemma 14.16, each $U_i$ is a subspace of $F^n$. Therefore, by Proposition 14.15,
$U_1 \cap U_2 \cap \cdots \cap U_m$ is also a subspace of $F^n$. However, $U = U_1 \cap U_2 \cap \cdots \cap U_m$, so $U$ is also a
subspace of $F^n$.

Finally, suppose $U$ contains an element $x \ne 0$. By Lemma 14.2(d), $\alpha \cdot x$ is a different element
of $F^n$, for each different $\alpha \in F$. Thus, if $F$ is infinite, then $\{\alpha \cdot x \mid \alpha \in F\}$ is a subset of $U$ that is
also infinite. As a result, if $F$ is infinite and $U$ contains an element other than 0, then $U$ must
also be infinite.


We can now consider an example where our field is finite. 


Examples

Let $T : \mathbb{Z}_5^3 \to \mathbb{Z}_5$ be the function

$$T((x_1, x_2, x_3)) = [3]_5 x_1 + [2]_5 x_2 + [1]_5 x_3,$$

where $(x_1, x_2, x_3) \in \mathbb{Z}_5^3$. Then

$$U = \{(x_1, x_2, x_3) \in \mathbb{Z}_5^3 \mid [3]_5 x_1 + [2]_5 x_2 + [1]_5 x_3 = [0]_5\}$$

is a subspace of $\mathbb{Z}_5^3$. Observe that the set $\{([1]_5, [1]_5, [0]_5),\ ([1]_5, [0]_5, [2]_5)\}$ is a basis
for $U$. Therefore, every element of $U$ can be written uniquely as

$$\alpha \cdot ([1]_5, [1]_5, [0]_5) + \beta \cdot ([1]_5, [0]_5, [2]_5),$$

where $\alpha, \beta \in \mathbb{Z}_5$. Since there are $5 \cdot 5 = 25$ different pairs of elements $(\alpha, \beta)$, where
$\alpha, \beta \in \mathbb{Z}_5$, we see that $U$ contains exactly 25 elements.
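
Because $\mathbb{Z}_5^3$ has only 125 elements, the count in this example can be confirmed by brute force. The following sketch is an illustration only, not part of the text; it represents the class $[k]_5$ by the integer `k` in `range(5)`, and the names `T`, `U`, `combo`, and `generated` are chosen here for convenience.

```python
from itertools import product

P = 5  # arithmetic is done modulo 5, so an integer k stands for the class [k]_5

def T(x):
    """The function T((x1, x2, x3)) = [3]x1 + [2]x2 + [1]x3 in Z_5."""
    x1, x2, x3 = x
    return (3 * x1 + 2 * x2 + 1 * x3) % P

# All solutions of T(x) = [0]_5, found by testing every element of Z_5^3.
U = {x for x in product(range(P), repeat=3) if T(x) == 0}

b1, b2 = (1, 1, 0), (1, 0, 2)  # the proposed basis of U

def combo(alpha, beta):
    """The element alpha*b1 + beta*b2 of Z_5^3."""
    return tuple((alpha * u + beta * v) % P for u, v in zip(b1, b2))

# Every linear combination of b1 and b2 with scalars in Z_5.
generated = {combo(alpha, beta) for alpha in range(P) for beta in range(P)}
```

The set `generated` has one element for each of the 25 pairs $(\alpha, \beta)$ and coincides with `U`, which matches both the uniqueness statement and the count of 25.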


The collection of equations displayed in Corollary 14.17 is often referred to as a system of
linear equations. In these equations, $x_1, x_2, \ldots, x_n$ are referred to as variables or as
unknowns. Therefore, we say that Corollary 14.17 deals with a system of $m$ linear equations
in $n$ unknowns. If the right side of every equation in a system of linear equations is 0, we say
that the system is homogeneous.


Given a system of homogeneous linear equations, it is clear that 0 is always a solution.
However, a system of homogeneous linear equations need not have any solutions other than 0.
For example, the system of equations

$$x_1 + x_2 = 0 \quad \text{and} \quad x_1 + 2x_2 = 0$$

certainly has no solution other than $0 = (0, 0)$. This system consists of two equations and two
unknowns. An important application of Theorem 14.8 is that if the number of unknowns
exceeds the number of equations, then a system of homogeneous linear equations must have a
solution other than 0. Deriving this fact from Theorem 14.8 will require an interesting
change in perspective, as we will temporarily need to think of the $x_i$'s as scalars and not as
unknowns.


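Theorem 14.18. Consider the following system of $m$ homogeneous linear equations in
$n$ unknowns:

$$\begin{aligned}
\alpha_{11} \cdot x_1 + \alpha_{12} \cdot x_2 + \cdots + \alpha_{1n} \cdot x_n &= 0, \\
\alpha_{21} \cdot x_1 + \alpha_{22} \cdot x_2 + \cdots + \alpha_{2n} \cdot x_n &= 0, \\
&\ \,\vdots \\
\alpha_{m1} \cdot x_1 + \alpha_{m2} \cdot x_2 + \cdots + \alpha_{mn} \cdot x_n &= 0,
\end{aligned}$$

where $\alpha_{ij} \in F$ and $1 \le i \le m$, $1 \le j \le n$. If the number of unknowns exceeds the number of
equations, then there exist solutions other than 0 in $F^n$.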




Proof. Given $\alpha_{ij} \in F$, where $1 \le i \le m$, $1 \le j \le n$, let

$$a_1 = (\alpha_{11}, \alpha_{21}, \ldots, \alpha_{m1}), \quad a_2 = (\alpha_{12}, \alpha_{22}, \ldots, \alpha_{m2}), \quad \ldots, \quad a_n = (\alpha_{1n}, \alpha_{2n}, \ldots, \alpha_{mn}).$$

Since each $a_j$ is an $m$-tuple of elements of $F$, we can consider $\{a_1, a_2, \ldots, a_n\}$ to be a set of
$n$ vectors belonging to the vector space $F^m$.

Since $F^m$ has dimension $m$ over $F$ and $n > m$, Theorem 14.8 asserts that the set
$\{a_1, a_2, \ldots, a_n\}$ must be linearly dependent. Therefore, there exist $x_1, x_2, \ldots, x_n \in F$, not all of which are 0,
such that

$$x_1 \cdot a_1 + x_2 \cdot a_2 + \cdots + x_n \cdot a_n = 0. \tag{12}$$

It is important to keep in mind that both sides of equation (12) belong to $F^m$. Therefore, we
can rewrite equation (12) in a form that explicitly shows the $m$ components of both sides of
equation (12). Since this is an essential part of the proof, you should be careful to check that
you understand where all these components come from. Equation (12) can now be rewritten as

$$(\alpha_{11} \cdot x_1 + \alpha_{12} \cdot x_2 + \cdots + \alpha_{1n} \cdot x_n,\ \ \alpha_{21} \cdot x_1 + \alpha_{22} \cdot x_2 + \cdots + \alpha_{2n} \cdot x_n,\ \ \ldots,\ \ \alpha_{m1} \cdot x_1 + \alpha_{m2} \cdot x_2 + \cdots + \alpha_{mn} \cdot x_n) = (0, 0, \ldots, 0).$$

Since each component in the preceding equation is equal to 0, it is now clear that
$x = (x_1, x_2, \ldots, x_n)$ is a solution to all $m$ homogeneous linear equations. Furthermore, since at least
one of $x_1, x_2, \ldots, x_n$ is not zero, $x \ne 0$ in $F^n$.
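
Theorem 14.18 guarantees that a nonzero solution exists but its proof does not compute one. As a complement, here is a minimal sketch of one way to find such a solution over $\mathbb{Q}$. It uses row reduction, which is a different technique from the dimension argument in the proof, and the function name `nonzero_solution` and the sample matrix `A` are hypothetical choices made for illustration.

```python
from fractions import Fraction

def nonzero_solution(A):
    """Return a nonzero x with A x = 0, assuming A has more columns than rows."""
    m = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(m), len(m[0])
    pivots = []  # pivots[k] is the column of the leading 1 in row k
    r = 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        inv = 1 / m[r][c]
        m[r] = [v * inv for v in m[r]]          # scale the pivot row to a leading 1
        for i in range(rows):
            if i != r and m[i][c] != 0:          # clear column c in every other row
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    # More unknowns than equations, so at least one column has no pivot.
    free = next(c for c in range(cols) if c not in pivots)
    x = [Fraction(0)] * cols
    x[free] = Fraction(1)                        # set one free unknown to 1
    for k, c in enumerate(pivots):
        x[c] = -m[k][free]                       # solve each pivot unknown
    return x

# Two homogeneous equations in three unknowns:
#   x1 + 2*x2 - x3 = 0  and  2*x1 + 3*x3 = 0.
A = [[1, 2, -1], [2, 0, 3]]
x = nonzero_solution(A)
```

The returned vector always has a coordinate equal to 1, so it is nonzero, in agreement with the theorem.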


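We now turn our attention to systems of linear equations that are not necessarily
homogeneous. Consider the following system of linear equations:

$$\begin{aligned}
\alpha_{11} \cdot x_1 + \alpha_{12} \cdot x_2 + \cdots + \alpha_{1n} \cdot x_n &= \beta_1, \\
\alpha_{21} \cdot x_1 + \alpha_{22} \cdot x_2 + \cdots + \alpha_{2n} \cdot x_n &= \beta_2, \\
&\ \,\vdots \\
\alpha_{m1} \cdot x_1 + \alpha_{m2} \cdot x_2 + \cdots + \alpha_{mn} \cdot x_n &= \beta_m,
\end{aligned}$$

where $\alpha_{ij} \in F$, $\beta_i \in F$, and $1 \le i \le m$, $1 \le j \le n$. In this case, we refer to the homogeneous
linear equations

$$\begin{aligned}
\alpha_{11} \cdot x_1 + \alpha_{12} \cdot x_2 + \cdots + \alpha_{1n} \cdot x_n &= 0, \\
\alpha_{21} \cdot x_1 + \alpha_{22} \cdot x_2 + \cdots + \alpha_{2n} \cdot x_n &= 0, \\
&\ \,\vdots \\
\alpha_{m1} \cdot x_1 + \alpha_{m2} \cdot x_2 + \cdots + \alpha_{mn} \cdot x_n &= 0
\end{aligned}$$

as the corresponding system of homogeneous linear equations. In our next result, we will
show that in order to find all the solutions of a system of linear equations, it suffices to find
one solution provided you can also find all the solutions of the corresponding system of
homogeneous linear equations.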


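Corollary 14.19. Let $W$ be the set of $n$-tuples $x = (x_1, x_2, \ldots, x_n)$ such that

$$\begin{aligned}
\alpha_{11} \cdot x_1 + \alpha_{12} \cdot x_2 + \cdots + \alpha_{1n} \cdot x_n &= \beta_1, \\
\alpha_{21} \cdot x_1 + \alpha_{22} \cdot x_2 + \cdots + \alpha_{2n} \cdot x_n &= \beta_2, \\
&\ \,\vdots \\
\alpha_{m1} \cdot x_1 + \alpha_{m2} \cdot x_2 + \cdots + \alpha_{mn} \cdot x_n &= \beta_m,
\end{aligned}$$

where $\alpha_{ij} \in F$, $\beta_i \in F$, and $1 \le i \le m$, $1 \le j \le n$. If $y \in W$ and $U$ is the set of solutions of the
corresponding homogeneous linear equations, then every element in $W$ is of the form $y + u$,
where $u \in U$. Conversely, if $y \in W$ and $u \in U$, then $y + u \in W$. Furthermore, if the field $F$ is
infinite and $W$ contains more than one element, then $W$ is an infinite set.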


Proof. As in the proof of Corollary 14.17, for every $i \le m$, we let $T_i : F^n \to F$ be the function

$$T_i(x) = \alpha_{i1} x_1 + \alpha_{i2} x_2 + \cdots + \alpha_{in} x_n,$$

for all $x \in F^n$. Observe that $x \in W$ if and only if $T_i(x) = \beta_i$, for all $i \le m$.

Now suppose $y$ is a fixed element of $W$. If $x \in W$, by applying both parts of Lemma 14.16, we
see that

$$T_i(x - y) = T_i(x + (-1)y) = T_i(x) + T_i((-1)y) = T_i(x) - T_i(y) = \beta_i - \beta_i = 0,$$

for all $i \le m$. This means that $x - y \in U$, so $x - y = u$, for some $u \in U$. Thus, $x = y + u$.

Conversely, if $y \in W$ and $u \in U$, then it follows from Lemma 14.16 that

$$T_i(y + u) = T_i(y) + T_i(u) = \beta_i + 0 = \beta_i,$$

for all $i \le m$. Thus, $y + u \in W$.

In addition, if $W$ has more than one element, let $x, y \in W$ such that $x \ne y$. Then our preceding
argument shows that $u = x - y$ is a nonzero element of $U$. Using Corollary 14.17, if $F$ is
infinite, then so is the set $U$. Since $F^n$ is a group under addition, a now familiar argument tells
us that if $y \in W$ and $u_1, u_2 \in U$ such that $u_1 \ne u_2$, then $y + u_1 \ne y + u_2$. Therefore, if $U$ is an
infinite set, so is $W$.


m Examples 


1. In $\mathbb{R}^2$, consider the linear equation $2x_1 + 3x_2 = 7$. In this case, the corresponding
homogeneous linear equation is $2x_1 + 3x_2 = 0$. The set of solutions of this
homogeneous linear equation is a subspace of $\mathbb{R}^2$ of dimension less than 2. Since
$(3, -2)$ is a solution to this equation, the solutions of this equation are a
1-dimensional subspace of $\mathbb{R}^2$ with basis $\{(3, -2)\}$. It is easy to see that $(2, 1)$ is a
solution of $2x_1 + 3x_2 = 7$. Therefore, Corollary 14.19 tells us that all solutions of
$2x_1 + 3x_2 = 7$ are of the form

$$(2, 1) + \alpha \cdot (3, -2) = (2 + 3\alpha, 1 - 2\alpha),$$

where $\alpha \in \mathbb{R}$.


2. In $\mathbb{Q}^3$, consider the linear equation $5x_1 - 2x_2 + 4x_3 = 9$. The set of solutions of the
corresponding homogeneous linear equation $5x_1 - 2x_2 + 4x_3 = 0$ must be a
subspace of $\mathbb{Q}^3$ of dimension less than 3. On the other hand, $(2, 5, 0)$ and $(0, 2, 1)$
are two linearly independent solutions of $5x_1 - 2x_2 + 4x_3 = 0$. Thus, the solutions of
$5x_1 - 2x_2 + 4x_3 = 0$ are a 2-dimensional subspace of $\mathbb{Q}^3$ with basis $\{(2, 5, 0), (0, 2, 1)\}$.
Furthermore, $(1, 0, 1)$ is a solution of $5x_1 - 2x_2 + 4x_3 = 9$. Therefore,
Corollary 14.19 asserts that all solutions of $5x_1 - 2x_2 + 4x_3 = 9$ are of
the form

$$(1, 0, 1) + \alpha \cdot (2, 5, 0) + \beta \cdot (0, 2, 1) = (1 + 2\alpha, 5\alpha + 2\beta, 1 + \beta),$$

where $\alpha, \beta \in \mathbb{Q}$.
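The second example can be spot-checked by machine. The following is a minimal sketch, not part of the original text: it uses exact rational arithmetic to confirm that the particular solution plus any combination of the two homogeneous solutions still satisfies the equation. The helper names `evaluate` and `general_solution` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Coefficients and right-hand side of 5*x1 - 2*x2 + 4*x3 = 9 over Q.
COEFFS = (5, -2, 4)
RHS = 9

def evaluate(coeffs, x):
    """Left-hand side of the linear equation at the tuple x."""
    return sum(c * xi for c, xi in zip(coeffs, x))

def general_solution(a, b):
    """Particular solution (1, 0, 1) plus a*(2, 5, 0) + b*(0, 2, 1)."""
    return (1 + 2 * a, 5 * a + 2 * b, 1 + b)

# A few sample rational parameters; Corollary 14.19 says any choice works.
samples = [Fraction(n, d) for n in (-3, 0, 2, 7) for d in (1, 2, 5)]
all_ok = all(
    evaluate(COEFFS, general_solution(a, b)) == RHS
    for a, b in product(samples, repeat=2)
)
print(all_ok)
```

A finite sample is of course no proof; the corollary is what guarantees the result for every choice of the two parameters.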




Exercises for Section 14.4 


In exercises 1-20, you will be examining various subsets of R[x]. In each case, determine if 
the set is a subspace of R[x] over R. If the set is not a subspace, provide a reason. If the set is a 
subspace and is also finite dimensional, compute the dimension. 


1. Polynomials in R[x] of degree less than 10.
2. Polynomials in R[x] of degree less than 10 with a constant term of 0.
3. Polynomials in R[x] of odd degree.
4. Polynomials in R[x] of even degree.
5. Polynomials in R[x] where the coefficient of every even power of x is 0.
6. Polynomials in R[x] of degree less than 8 where the coefficient of every even power of x is 0.
7. Polynomials in R[x] of degree less than 7 that are also multiples of x°.
8. Polynomials in R[x] of degree 1.
9. Polynomials in R[x] which have no real roots.
10. Polynomials in R[x] of degree less than 5 that have no real roots.
11. Polynomials in R[x] where the coefficient of x° is 0.
12. Polynomials in R[x] of degree less than 5 where the coefficient of x is 0.
13. Polynomials in R[x] where the coefficient of x° is not 0.
14. Polynomials in R[x] where the coefficients of x and x” are different or are both equal to 0.
15. Polynomials in R[x] where the coefficients of x, x7, and x? are all the same.
16. Polynomials in R[x] of degree less than 7 where the coefficients of x, x’, and x? are all the same.
17. Polynomials in R[x] that are multiples of ae oe
18. Polynomials in R[x] that are multiples of x? + 1 and have degree less than 20.
19. Polynomials in R[x] that are monic.
20. Polynomials in R[x] that have more than one real root.

21. Let $V$ be a vector space over a field $F$ of dimension $n$. If $m \in \mathbb{N}$ and $m < n$, show that $V$
contains a subspace of dimension $m$ over $F$.




In exercises 22-26, we generalize some of the ideas from Lemma 14.16. Let $V$, $W$ be vector
spaces over a field $F$ and let $T : V \to W$ be a function such that $T(v_1 + v_2) = T(v_1) + T(v_2)$
and $T(\alpha \cdot v_1) = \alpha \cdot T(v_1)$, for all $v_1, v_2 \in V$ and $\alpha \in F$. We call $T$ a linear transformation.
Since $V$ and $W$ are groups under addition, observe that $T$ is a homomorphism of groups that
also preserves scalar multiplication.


22. If $v_1, \ldots, v_n \in V$ and $\alpha_1, \ldots, \alpha_n \in F$, show that
$T(\alpha_1 \cdot v_1 + \cdots + \alpha_n \cdot v_n) = \alpha_1 \cdot T(v_1) + \cdots + \alpha_n \cdot T(v_n)$.

23. Let $\mathrm{Ker}(T) = \{v \in V \mid T(v) = 0\}$; show that $\mathrm{Ker}(T)$ is a subspace of $V$.

24. Let $\mathrm{Im}(T) = \{w \in W \mid \text{there exists } v \in V \text{ such that } w = T(v)\}$; show that $\mathrm{Im}(T)$ is a
subspace of $W$.

25. Suppose $w_1, w_2, \ldots, w_n \in \mathrm{Im}(T)$ are linearly independent in $W$ and let
$v_1, v_2, \ldots, v_n \in V$ such that $T(v_i) = w_i$, for $1 \le i \le n$. Show that the set $\{v_1, v_2, \ldots, v_n\}$
is linearly independent in $V$.

26. Suppose $V$ has dimension $n$ over $F$.
(a) Show that $\mathrm{Im}(T)$ has dimension less than or equal to $n$.

(b) If we let $m$ denote the dimension of $\mathrm{Im}(T)$, show that there exists a linearly
independent subset $\{v_1, v_2, \ldots, v_m\}$ of $V$ such that $\{T(v_1), T(v_2), \ldots, T(v_m)\}$ is a
basis for $\mathrm{Im}(T)$.

(c) Show that $\mathrm{Ker}(T)$ has dimension less than or equal to $n$.

(d) If we let $l$ denote the dimension of $\mathrm{Ker}(T)$ and let $\{u_1, u_2, \ldots, u_l\}$ be a basis for
$\mathrm{Ker}(T)$, show that $\{v_1, v_2, \ldots, v_m, u_1, u_2, \ldots, u_l\}$ is a basis for $V$.

(e) Conclude that if $V$ is finite dimensional, then the dimension of $V$ is equal to the
dimension of $\mathrm{Im}(T)$ plus the dimension of $\mathrm{Ker}(T)$.


In exercises 53 and 54 following Section 8.1, you saw that a group cannot be the union of two 
proper subgroups. Since every vector space is also a group under addition, it immediately 
follows that a vector space cannot be the union of two proper subspaces. In exercises 27-30, 
we will prove the much stronger result that a vector space cannot be the union of any finite 
number of proper subspaces, provided the underlying field is infinite. 


27. Let $V$ be a vector space over an infinite field $F$ and let $U_1, U_2, \ldots, U_n$ be subspaces of $V$.
Suppose $u, v \in V$ such that $u + \alpha \cdot v \in U_1 \cup U_2 \cup \cdots \cup U_n$, for all $\alpha \in F$. Show that there is
some $i \le n$ such that both $u$ and $v$ belong to $U_i$.


28. Let $U_1, U_2, \ldots, U_n, U_{n+1}$ be subspaces of a vector space $V$ over an infinite field $F$. If
$u, v \in V$ such that $u \notin U_1 \cup U_2 \cup \cdots \cup U_n$ and $v \notin U_{n+1}$, show that there is some $\alpha \in F$
such that $u + \alpha \cdot v \notin U_1 \cup U_2 \cup \cdots \cup U_n \cup U_{n+1}$.




29. Let $V$ be a vector space over an infinite field $F$ and let $U_1, U_2, \ldots, U_n$ be subspaces of $V$,
none of which is equal to $V$. Show that $V \ne U_1 \cup U_2 \cup \cdots \cup U_n$. If you try to prove this
using Mathematical Induction, you should first take a look at exercises 27 and 28.

30. Let $V$ be a vector space over an infinite field $F$ and let $U_1, U_2, \ldots, U_n$ be subspaces of $V$.
If $U_1 \cup U_2 \cup \cdots \cup U_n$ is a subspace of $V$, show that there is some $m \le n$ such that $U_m$
contains every $U_i$, where $i \le n$.

31. The solutions of

$$3x + 8y + 2z = 0$$

are a subspace of $\mathbb{R}^3$. Find the dimension and a basis for this subspace.

32. The solutions of

$$5x - 6y + 5z = 0$$

are a subspace of $\mathbb{R}^3$. Find the dimension and a basis for this subspace.

33. The solutions of

$$4x - 5y + 2z = 0$$
$$7x + 10y - 3z = 0$$

are a subspace of $\mathbb{R}^3$. Find the dimension and a basis for this subspace.

34. The solutions of

$$2x + 3y + 2z = 0$$
$$-4x + 9y - z = 0$$

are a subspace of $\mathbb{R}^3$. Find the dimension and a basis for this subspace.

35. The solutions of

$$-x + 3y + 2z + 11w = 0$$

are a subspace of $\mathbb{R}^4$. Find the dimension and a basis for this subspace.

36. The solutions of

$$5x - 8y + 10z - 7w = 0$$

are a subspace of $\mathbb{R}^4$. Find the dimension and a basis for this subspace.


37. The solutions of

$$11x - 5y + 6z + 2w = 0$$
$$6x - 4z + 5w = 0$$

are a subspace of $\mathbb{R}^4$. Find the dimension and a basis for this subspace.

38. The solutions of

$$-7x + 4y + 18z - 5w = 0$$
$$4x - y + 9w = 0$$

are a subspace of $\mathbb{R}^4$. Find the dimension and a basis for this subspace.

39. The solutions of

$$x - 4y + 2z + 3w = 0$$
$$6x + 2y - z + 5w = 0$$
$$-5y + 7w = 0$$

are a subspace of $\mathbb{R}^4$. Find the dimension and a basis for this subspace.

40. The solutions of

$$7x + 4y - 8z - 10w = 0$$
$$-4x + 3y + 5z + w = 0$$
$$-x + 11y - 14z + 2w = 0$$

are a subspace of $\mathbb{R}^4$. Find the dimension and a basis for this subspace.

41. Find a solution of

$$3x + 8y + 2z = 3$$

in $\mathbb{R}^3$. Then use your answer from exercise 31 to find all solutions of this equation.

42. Find a solution of

$$5x - 6y + 5z = -12$$

in $\mathbb{R}^3$. Then use your answer from exercise 32 to find all solutions of this equation.

43. Find a solution of

$$4x - 5y + 2z = -90$$
$$7x + 10y - 3z = 7$$

in $\mathbb{R}^3$. Then use your answer from exercise 33 to find all solutions of these equations.




44. Find a solution of

$$2x + 3y + 2z = 5$$
$$-4x + 9y - z = 50$$

in $\mathbb{R}^3$. Then use your answer from exercise 34 to find all solutions of these equations.

45. Find a solution of

$$-x + 3y + 2z + 11w = 32$$

in $\mathbb{R}^4$. Then use your answer from exercise 35 to find all solutions of this equation.

46. Find a solution of

$$5x - 8y + 10z - 7w = 73$$

in $\mathbb{R}^4$. Then use your answer from exercise 36 to find all solutions of this equation.

47. Find a solution of

$$11x - 5y + 6z + 2w = 14$$
$$6x - 4z + 5w = -6$$

in $\mathbb{R}^4$. Then use your answer from exercise 37 to find all solutions of these equations.

48. Find a solution of

$$-7x + 4y + 18z - 5w = -60$$
$$4x - y + 9w = 86$$

in $\mathbb{R}^4$. Then use your answer from exercise 38 to find all solutions of these equations.

49. Find a solution of

$$x - 4y + 2z + 3w = 5$$
$$6x + 2y - z + 5w = 69$$
$$-5y + 7w = 29$$

in $\mathbb{R}^4$. Then use your answer from exercise 39 to find all solutions of these equations.

50. Find a solution of

$$7x + 4y - 8z - 10w = 31$$
$$-4x + 3y + 5z + w = -9$$
$$-x + 11y - 14z + 2w = 149$$

in $\mathbb{R}^4$. Then use your answer from exercise 40 to find all solutions of these equations.


Degrees and Galois Groups of 
Field Extensions 


In Chapter 16, we will prove that 60° angles cannot be trisected with a ruler and compass. To 
do this, for every real number that can be constructed, we will associate a field K. It will turn 
out that K is a finite dimensional vector space over Q, and we will need to determine the 
dimension of K over Q. 


Then, in Chapter 17, we will prove the insolvability of the quintic. In this situation, to every 
polynomial $f(x) \in \mathbb{Q}[x]$, we will associate a field $L$ such that $L$ is also finite dimensional over
Q. However, not only will we be interested in the dimension of L over Q, but we will be even 
more concerned with the structure of the Galois group Gal(L/Q). 


As we can see, fundamental and concrete questions in abstract algebra can often be reduced to 
questions involving fields, dimensions of vector spaces, and groups of automorphisms. Since 
the dimension of a field compared to a subfield is often related to the degree of a polynomial, 
we adopt the following terminology: 


Definition 15.1. If $F \subseteq K$ are fields, we call $K$ a field extension of $F$. The dimension of $K$ as
a vector space over $F$ is often referred to as the degree of $K$ over $F$ and denoted as $[K : F]$.
Furthermore, if $[K : F]$ is finite, we say that $K$ is a finite extension of $F$.


In light of our goals and the fact that we now often use the word degree in place of dimension, it 
is clear that we need to turn our attention to the degrees and Galois groups of field extensions. 


15.1 Degrees of Field Extensions 


Most of the fields we will deal with are related to roots of polynomials in $\mathbb{Q}[x]$. Having proven
the Fundamental Theorem of Algebra in Chapter 6, we know that all these roots belong to $\mathbb{C}$.
Therefore, virtually all of our fields $K$ will have the property that $\mathbb{Q} \subseteq K \subseteq \mathbb{C}$. Proofs of the
Fundamental Theorem of Algebra that rely almost entirely on algebra require an enormous
amount of algebraic machinery. Therefore, in most textbooks, a proof of the insolvability of
the quintic appears before a proof of the Fundamental Theorem of Algebra. That approach
makes developing the Galois theory needed to prove the insolvability of the quintic very
abstract. However, we have the Fundamental Theorem of Algebra at our disposal and therefore
know that the relevant fields lie between $\mathbb{Q}$ and $\mathbb{C}$. This will help in making our discussion of
Galois theory and the insolvability of the quintic relatively concrete.


In order to be a field, a set and its two operations need to satisfy nine axioms. Therefore, 
checking if a set is a field can sometimes be quite tedious. That can make it difficult to 
generate a large collection of examples. However, the next result will make it quite easy to 
generate examples of fields that lie between Q and C. 


Lemma 15.2. Let $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ be such that $K$ is a field and $L$ is a finite dimensional
vector space over $K$. Then $L$ is a field if and only if it is closed under multiplication.


Proof. One direction is clear, for if $L$ is a field, then it must be closed under multiplication.
The heart of this result is the other direction. By virtue of being a vector space, $L$
automatically satisfies all the field axioms that deal solely with addition. Furthermore, since $L$ is a
subset of $\mathbb{C}$, multiplication in $L$ satisfies the associative and commutative laws. For the same
reason, the distributive laws are also satisfied. In addition, since $L$ contains $\mathbb{Q}$, it contains a
multiplicative identity. Therefore, even without using the fact that $L$ is finite dimensional over
$K$, we have succeeded in showing that $L$ is a commutative ring.


At this point, in order to be a field, all that remains is to show that every nonzero element of L 
has a multiplicative inverse. It is easy to jump the gun here. Since L is contained in C and C is 
a field, it is indeed the case that every nonzero element of L has a multiplicative inverse that is 
contained in C. But that is not enough. We need that the multiplicative inverse belongs to L. 
This is where we use that L is finite dimensional over K. 


Suppose $b \in L$ is nonzero; we know $b^{-1} \in \mathbb{C}$, and we need to show $b^{-1} \in L$. Let $n$ denote the
dimension of $L$ over $K$ and then consider the $n + 1$ elements $1, b, b^2, \ldots, b^n \in L$. Since any
collection of more than $n$ elements of $L$ must be linearly dependent over $K$, there exist
$a_i \in K$, not all of which are zero, such that

$$a_0 + a_1 b + \cdots + a_{n-1} b^{n-1} + a_n b^n = 0.$$

Let $s$ be the smallest integer such that $a_s \ne 0$. Note that $s \ge 0$ and $s < n$. For every $i$, we can let
$\gamma_i = -a_s^{-1} a_i$; then multiplying the preceding equation by $a_s^{-1} (b^{-1})^{s+1}$ results in

$$b^{-1} - \gamma_{s+1} - \gamma_{s+2} b - \cdots - \gamma_n b^{n-s-1} = 0.$$

This can be rewritten as

$$b^{-1} = \gamma_{s+1} + \gamma_{s+2} b + \cdots + \gamma_n b^{n-s-1},$$

so $b^{-1}$ is a linear combination over $K$ of elements of $L$. Hence, $b^{-1} \in L$, as desired.




Observe that in order to use Lemma 15.2, we do not need to know the dimension of L, but we 
merely need to know that it is finite dimensional over K. Lemma 14.13 indicated that for L to 
be finite dimensional, it suffices to have a finite spanning set. Thus, in many of our 
applications of Lemma 15.2, L will initially be described as the span over K of some finite 
subset of C. In order to show that L is a field, it will suffice to show that L is closed under 
multiplication. But since L is the span of a finite set, this will merely require that the product 
of any two elements of the spanning set also belongs to L. This is typically a fairly 
straightforward computation. Thus, it will be easy to generate many examples of fields. Also, 
in many of our applications of Lemma 15.2, we will be in the special case where K = Q. 


Examples


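1. Let $\mathbb{Q}(\sqrt{3}) = \{\alpha + \beta\sqrt{3} \mid \alpha, \beta \in \mathbb{Q}\}$. Since $\mathbb{Q}(\sqrt{3})$ is the span over $\mathbb{Q}$ of the elements
$1, \sqrt{3} \in \mathbb{C}$, it is clear that $\mathbb{Q}(\sqrt{3})$ is a vector space over $\mathbb{Q}$ of dimension at most 2.
Our spanning set $\{1, \sqrt{3}\}$ has only two elements, and the product of any two of them
remains in $\mathbb{Q}(\sqrt{3})$. Therefore, $\mathbb{Q}(\sqrt{3})$ is indeed a field.

Before leaving this example, we should mention that although the proof of
Lemma 15.2 indicates that every nonzero element of $\mathbb{Q}(\sqrt{3})$ has a multiplicative
inverse in $\mathbb{Q}(\sqrt{3})$, the proof does not provide a particularly efficient way of finding it.
In fact, to find the inverse of $\alpha + \beta\sqrt{3}$, it is more helpful to note that

$$(\alpha + \beta\sqrt{3})(\alpha - \beta\sqrt{3}) = \alpha^2 - 3\beta^2.$$

Since $\sqrt{3} \notin \mathbb{Q}$, you can check that if $\alpha + \beta\sqrt{3} \ne 0$, then $0 \ne \alpha^2 - 3\beta^2 \in \mathbb{Q}$. Dividing
the previous equation by $\alpha^2 - 3\beta^2$ now results in

$$(\alpha + \beta\sqrt{3}) \cdot \frac{\alpha - \beta\sqrt{3}}{\alpha^2 - 3\beta^2} = 1.$$

Therefore, $\frac{\alpha}{\alpha^2 - 3\beta^2} - \frac{\beta}{\alpha^2 - 3\beta^2}\sqrt{3} \in \mathbb{Q}(\sqrt{3})$ is the multiplicative inverse of $\alpha + \beta\sqrt{3}$.


2. Let $\mathbb{Q}(\sqrt{3}, i) = \{a + bi \mid a, b \in \mathbb{Q}(\sqrt{3})\}$. Observe that $\mathbb{Q}(\sqrt{3}, i)$ is the span of the set
$\{1, i\}$ over the field $\mathbb{Q}(\sqrt{3})$ from the previous example. Thus, $\mathbb{Q}(\sqrt{3}, i)$ is a vector
space over $\mathbb{Q}(\sqrt{3})$ of dimension at most 2. However, it is also quite easy to see that
the product of any two elements of the spanning set $\{1, i\}$ lies in $\mathbb{Q}(\sqrt{3}, i)$. Thus, $\mathbb{Q}(\sqrt{3}, i)$
is indeed a field. At this point, there are several ways to interpret the meaning of the
symbol $\mathbb{Q}(\sqrt{3}, i)$. However, when we arrive at Theorem 15.4, we will see that all
possible interpretations yield the same set.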


3. Let $\mathbb{Q}(7^{1/3}) = \{\alpha + \beta 7^{1/3} + \gamma 7^{2/3} \mid \alpha, \beta, \gamma \in \mathbb{Q}\}$. Then $\mathbb{Q}(7^{1/3})$ is the span over $\mathbb{Q}$ of the finite
set $\{1, 7^{1/3}, 7^{2/3}\}$. Since the product of any two elements from the spanning set belongs
to $\mathbb{Q}(7^{1/3})$, we see that $\mathbb{Q}(7^{1/3})$ is a field. Note that the set $V = \{\alpha + \beta 7^{1/3} \mid \alpha, \beta \in \mathbb{Q}\}$ is not
a field, as $7^{1/3}$ belongs to the spanning set but $7^{1/3} \cdot 7^{1/3} = 7^{2/3} \notin V$. We should point out
that $7^{2/3}$ certainly does not look like an element of $V$, but it does require some work
to verify that $7^{2/3}$ cannot be written in the form $\alpha + \beta 7^{1/3}$, where $\alpha, \beta \in \mathbb{Q}$. However, for
the moment, we will accept that $7^{2/3} \notin V$, as we will soon prove a more general result
that easily covers this situation.


4. Let $\mathbb{Q}(\sqrt{2} + \sqrt{5}) = \{a + b(\sqrt{2} + \sqrt{5}) + c(\sqrt{2} + \sqrt{5})^2 + d(\sqrt{2} + \sqrt{5})^3 \mid a, b, c, d \in \mathbb{Q}\}$.
Since $\mathbb{Q}(\sqrt{2} + \sqrt{5})$ is the span over $\mathbb{Q}$ of the finite set

$$\{1, \sqrt{2} + \sqrt{5}, (\sqrt{2} + \sqrt{5})^2, (\sqrt{2} + \sqrt{5})^3\},$$

to see that $\mathbb{Q}(\sqrt{2} + \sqrt{5})$ is a field, all we need to do is to show that the product of any
two elements of this set remains in $\mathbb{Q}(\sqrt{2} + \sqrt{5})$.

Before doing the necessary computations, you might wonder why the largest
exponent of $\sqrt{2} + \sqrt{5}$ that appears in our spanning set is 3. Observe that if we let
$w = \sqrt{2} + \sqrt{5}$, then $w^2 = 7 + 2\sqrt{10}$, so $(w^2 - 7)^2 = 40$. We can rewrite this fact as

$$w^4 = 14w^2 - 9.$$

In other words,

$$(\sqrt{2} + \sqrt{5})^4 = 14(\sqrt{2} + \sqrt{5})^2 - 9,$$

which tells us that $(\sqrt{2} + \sqrt{5})^4 \in \mathbb{Q}(\sqrt{2} + \sqrt{5})$. Furthermore, our preceding work also
tells us that

$$w^5 = 14w^3 - 9w.$$

Going one step further, we obtain

$$w^6 = 14w^4 - 9w^2 = 14(14w^2 - 9) - 9w^2.$$

These equations tell us that $(\sqrt{2} + \sqrt{5})^5, (\sqrt{2} + \sqrt{5})^6 \in \mathbb{Q}(\sqrt{2} + \sqrt{5})$. Thus, the
product of any two elements from our spanning set is in $\mathbb{Q}(\sqrt{2} + \sqrt{5})$, so $\mathbb{Q}(\sqrt{2} + \sqrt{5})$ is
a field. After doing these computations, we can see that the reason 3 was the largest
power of $\sqrt{2} + \sqrt{5}$ that needed to be in the spanning set is that
$\sqrt{2} + \sqrt{5}$ satisfies a polynomial of degree 4 in $\mathbb{Q}[x]$. This enabled us to write larger
powers of $\sqrt{2} + \sqrt{5}$ as linear combinations of elements from our spanning set. As we
shall see in Theorem 15.3, this is a special case of a very important and more general
fact.




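5. Let $\mathbb{Q}(\mathrm{cis}(\pi/5)) = \{\sum_{j=0}^{9} a_j (\mathrm{cis}(\pi/5))^j \mid a_j \in \mathbb{Q}\}$. The set $\mathbb{Q}(\mathrm{cis}(\pi/5))$ is spanned over $\mathbb{Q}$ by
the finite set $\{(\mathrm{cis}(\pi/5))^j \mid 0 \le j \le 9\}$. By DeMoivre's Theorem, $(\mathrm{cis}(\pi/5))^{10} = 1$. Note that
the product of any two elements of our spanning set is of the form $(\mathrm{cis}(\pi/5))^n$, where
$n \ge 0$. Applying the division algorithm, if $r$ is the remainder after $n$ is divided by 10,
then $0 \le r \le 9$ and we have

$$(\mathrm{cis}(\pi/5))^n = (\mathrm{cis}(\pi/5))^r \in \mathbb{Q}(\mathrm{cis}(\pi/5)).$$

Therefore, the product of any two elements of our spanning set does belong to
$\mathbb{Q}(\mathrm{cis}(\pi/5))$, so $\mathbb{Q}(\mathrm{cis}(\pi/5))$ is a field.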
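The inverse formula from the first example, which multiplies by the conjugate and divides by the rational number obtained, can be tried out with exact arithmetic. The following is a minimal sketch under the assumption that an element of the field generated by the square root of 3 is stored as a pair (a, b) of rationals standing for a + b times that square root; the function names `multiply` and `inverse` are illustrative and not from the text.

```python
from fractions import Fraction

def multiply(x, y):
    """Multiply a1 + b1*sqrt(3) and a2 + b2*sqrt(3), each stored as a pair (a, b)."""
    a1, b1 = x
    a2, b2 = y
    return (a1 * a2 + 3 * b1 * b2, a1 * b2 + a2 * b1)

def inverse(x):
    """Inverse of a + b*sqrt(3), computed as (a - b*sqrt(3)) / (a^2 - 3*b^2)."""
    a, b = x
    # For rational a, b this is nonzero unless a = b = 0, since sqrt(3) is irrational.
    norm = a * a - 3 * b * b
    if norm == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return (a / norm, -b / norm)

x = (Fraction(2), Fraction(5))     # the element 2 + 5*sqrt(3)
x_inv = inverse(x)
print(x_inv)                        # coordinates of the inverse in the basis {1, sqrt(3)}
print(multiply(x, x_inv))           # should be the pair representing 1
```

Using `Fraction` rather than floating point keeps every coordinate rational, so the product with the inverse comes out as exactly 1.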


In the preceding five examples, we used Lemma 15.2 to show that various sets were fields. At 
that point, we were not terribly concerned with computing the degrees of these fields over 
smaller fields. In particular, in the last example we were content to merely say that $\mathbb{Q}(\mathrm{cis}(\pi/5))$
is a field, and its degree over Q is at most 10. It turns out that the degree of this extension is 
significantly smaller than 10. Computing degrees of field extensions can often require a long 
series of computations. However, we now present a result that, in many cases, will make it 
much easier to find the degree of a field extension. It will be stated for fields more general than 
those that lie between Q and C, but the increased generality should not make things more 
difficult. 


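Theorem 15.3. Let $K \subseteq L$ be fields and let $p(x)$ be an irreducible polynomial of degree $n$ in
$K[x]$. If $\theta \in L$ is a root of $p(x)$, let

$$K(\theta) = \{\alpha_0 + \alpha_1\theta + \alpha_2\theta^2 + \cdots + \alpha_{n-1}\theta^{n-1} \mid \alpha_0, \alpha_1, \ldots, \alpha_{n-1} \in K\}.$$

Then $K(\theta)$ is a field extension of $K$ such that $\{1, \theta, \theta^2, \ldots, \theta^{n-1}\}$ is a basis and $[K(\theta) : K] = n$.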


Proof. One way to think of $K(\theta)$ is as the set consisting of 0 as well as all elements of $L$ that can
be obtained by plugging $\theta$ into an element of $K[x]$ of degree less than $n$. Certainly, $K(\theta)$ is the
span over $K$ of the finite set $\{1, \theta, \ldots, \theta^{n-1}\}$. In light of Lemma 15.2, to show that $K(\theta)$ is a
field, it suffices to show that the product of any two elements of our spanning set belongs to
$K(\theta)$. The product of any two elements of $\{1, \theta, \ldots, \theta^{n-1}\}$ is of the form $\theta^m$, for some $m \ge 0$,
and this leads us to an examination of the element $x^m \in K[x]$.


The Division Algorithm in $K[x]$ (Theorem 12.6) tells us that there exist $q(x), r(x) \in K[x]$ such
that

$$x^m = q(x) \cdot p(x) + r(x) \quad \text{with } \deg(r(x)) < n \text{ or } r(x) = 0.$$

If we replace $x$ by $\theta$ then, inside the field $L$, we have

$$\theta^m = q(\theta) \cdot p(\theta) + r(\theta) = q(\theta) \cdot 0 + r(\theta) = r(\theta).$$

Since either $r(x)$ has degree less than $n$ or $r(x) = 0$, it follows that $r(\theta) \in K(\theta)$. Thus,
$\theta^m = r(\theta) \in K(\theta)$ and $K(\theta)$ is indeed a field.


If we look back at the previous paragraph, we can see that the division algorithm actually
shows that $K(\theta)$ is the set of all elements of $L$ obtained by plugging $\theta$ into elements of $K[x]$.
Until now, it has not been necessary to assume that $p(x)$ is irreducible. However, we will need
this assumption to show that $[K(\theta) : K] = n$. Since the set $\{1, \theta, \ldots, \theta^{n-1}\}$ spans $K(\theta)$ over $K$,
it only remains to show that this set is linearly independent over $K$. To this end, suppose
$\alpha_i \in K$ such that

$$\alpha_0 + \alpha_1\theta + \alpha_2\theta^2 + \cdots + \alpha_{n-1}\theta^{n-1} = 0. \tag{1}$$

We need to show that $\alpha_i = 0$, for $0 \le i \le n - 1$. Let

$$r(x) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \cdots + \alpha_{n-1} x^{n-1} \in K[x];$$

if $r(x) \ne 0$, then it follows from (1) that $r(x)$ is an element of $K[x]$ of degree smaller than $n$
that has $\theta$ as a root. However, since $p(x)$ is irreducible and has degree $n$, Lemma 12.9 asserts
that there are no nonzero polynomials in $K[x]$ of degree less than $n$ that have $\theta$ as a root.
Thus, $r(x) = 0$. As a result, all the coefficients of $r(x)$ are 0, so every $\alpha_i = 0$. Thus, the set
$\{1, \theta, \ldots, \theta^{n-1}\}$ is indeed linearly independent over $K$.
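The reduction step in the proof of Theorem 15.3, replacing a power of the root by the remainder on division by p(x), can be carried out mechanically. The following is a minimal sketch, assuming that p is monic of degree at least 1 and that polynomials are stored as coefficient lists with the lowest degree first; the function name `poly_mod_power` is an illustrative choice, not something from the text. It is applied to the polynomial x^4 - 14x^2 + 9 from the earlier example.

```python
from fractions import Fraction

def poly_mod_power(m, p):
    """Remainder of x**m on division by the monic polynomial p.

    Polynomials are coefficient lists, lowest degree first, so
    p = [9, 0, -14, 0, 1] stands for x^4 - 14x^2 + 9.
    """
    n = len(p) - 1                       # degree of p, assumed to be at least 1
    assert n >= 1 and p[-1] == 1, "this sketch assumes p is monic of degree >= 1"
    r = [Fraction(0)] * n
    r[0] = Fraction(1)                   # start from x^0 = 1
    for _ in range(m):
        # Multiply the current remainder by x ...
        overflow = r[-1]
        r = [Fraction(0)] + r[:-1]
        # ... and replace x^n by -(p_0 + p_1 x + ... + p_{n-1} x^{n-1}).
        for i in range(n):
            r[i] -= overflow * p[i]
    return r

p = [9, 0, -14, 0, 1]
for m in (4, 5, 6):
    # Coefficients of the remainder of x^m, lowest degree first.
    print(m, poly_mod_power(m, p))
```

Substituting the root for x in each remainder expresses the corresponding power of the root in the basis of lower powers, which is exactly how the proof shows closure under multiplication.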


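To this point, we have been defining objects such as $\mathbb{Q}(\sqrt{3})$, $\mathbb{Q}(i)$, $\mathbb{Q}(7^{1/3})$, and $\mathbb{Q}(\sqrt{2} + \sqrt{5})$ on
a case by case basis. However, Theorem 15.3 leads us to a single definition that describes all of
these objects. If $K \subseteq L$ are fields and $\theta \in L$ is a root of some irreducible $p(x) \in K[x]$ of
degree $n$, then $K(\theta)$ is the field

$$\{\alpha_0 + \alpha_1\theta + \alpha_2\theta^2 + \cdots + \alpha_{n-1}\theta^{n-1} \mid \alpha_0, \alpha_1, \ldots, \alpha_{n-1} \in K\}.$$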


Examples


1. Since $\sqrt{3}$ is a root of $x^2 - 3$ and $x^2 - 3$ is irreducible in $\mathbb{Q}[x]$ and has degree 2,
Theorem 15.3 immediately asserts that the field $\mathbb{Q}(\sqrt{3})$ has basis $\{1, \sqrt{3}\}$ over $\mathbb{Q}$ and
$[\mathbb{Q}(\sqrt{3}) : \mathbb{Q}] = 2$.


2. Since $i$ is a root of $x^2 + 1$ and this polynomial is irreducible and of degree 2 in $\mathbb{Q}[x]$,
it follows that $\mathbb{Q}(i)$ has basis $\{1, i\}$ over $\mathbb{Q}$ and $[\mathbb{Q}(i) : \mathbb{Q}] = 2$. Similarly, $x^2 + 1$ is
irreducible and of degree 2 in $\mathbb{R}[x]$, so $\mathbb{R}(i)$ has basis $\{1, i\}$ over $\mathbb{R}$ and $[\mathbb{R}(i) : \mathbb{R}] = 2$.
On the other hand, $x^2 + 1$ is not irreducible in $\mathbb{C}[x]$. In fact, the monic, irreducible
polynomial in $\mathbb{C}[x]$ that has $i$ as a root is $x - i$. Since this polynomial has degree 1,
$[\mathbb{C}(i) : \mathbb{C}] = 1$. Observe that this is simply another way of saying that $\mathbb{C}(i) = \mathbb{C}$.


3. The number $7^{1/3}$ is a root of $x^3 - 7$, which is irreducible and of degree 3 in $\mathbb{Q}[x]$.
Therefore, the field $\mathbb{Q}(7^{1/3})$ has basis $\{1, 7^{1/3}, 7^{2/3}\}$ over $\mathbb{Q}$ and $[\mathbb{Q}(7^{1/3}) : \mathbb{Q}] = 3$. In
particular, it is impossible to write $7^{2/3}$ as a linear combination of the other two basis
elements. Thus, Theorem 15.3 immediately allows us to say that $7^{2/3}$ cannot be
written in the form $\alpha + \beta 7^{1/3}$, where $\alpha, \beta \in \mathbb{Q}$. Recall that this was an issue that we did
not resolve when it first arose in the third example following Lemma 15.2.


4. Earlier in this chapter, we saw that the number $\sqrt{2} + \sqrt{5}$ is a root of the polynomial
$x^4 - 14x^2 + 9 \in \mathbb{Q}[x]$. Therefore, $\mathbb{Q}(\sqrt{2} + \sqrt{5})$ is a field and $[\mathbb{Q}(\sqrt{2} + \sqrt{5}) : \mathbb{Q}] \le 4$.
However, to determine if 4 is indeed the degree of this extension, we need to check if
$x^4 - 14x^2 + 9$ is the minimum polynomial for $\sqrt{2} + \sqrt{5}$ over $\mathbb{Q}$. Remember that this is
the same as determining if $x^4 - 14x^2 + 9$ is irreducible in $\mathbb{Q}[x]$.


If we look back at the calculations which produced the polynomial $x^4 - 14x^2 + 9$
and replace $\sqrt{2} + \sqrt{5}$ by $\sqrt{2} - \sqrt{5}$, we can see that $\sqrt{2} - \sqrt{5}$ is also a root of
$x^4 - 14x^2 + 9$. Furthermore, since $x^4 - 14x^2 + 9$ only involves even powers of $x$, the
additive inverses of these two roots must also be roots of $x^4 - 14x^2 + 9$. Thus
$\sqrt{2} + \sqrt{5}$, $\sqrt{2} - \sqrt{5}$, $-\sqrt{2} - \sqrt{5}$, $-\sqrt{2} + \sqrt{5}$ are the four roots of $x^4 - 14x^2 + 9$. Since
we now know the four roots of $x^4 - 14x^2 + 9$ in $\mathbb{R}$ and $\mathbb{C}$, it must factor in both $\mathbb{R}[x]$
and $\mathbb{C}[x]$ as

$$\left(x - (\sqrt{2} + \sqrt{5})\right)\left(x - (\sqrt{2} - \sqrt{5})\right)\left(x - (-\sqrt{2} - \sqrt{5})\right)\left(x - (-\sqrt{2} + \sqrt{5})\right). \tag{2}$$

If $x^4 - 14x^2 + 9$ were reducible in $\mathbb{Q}[x]$, then it would either have a linear factor or be
a product of two irreducible quadratics. However, the rational root test shows that
$x^4 - 14x^2 + 9$ has no rational roots, so it has no linear factors in $\mathbb{Q}[x]$. The Unique
Factorization Theorem (Theorem 12.4) asserts that $x^4 - 14x^2 + 9$ factors uniquely in
both $\mathbb{R}[x]$ and $\mathbb{C}[x]$. Therefore, if $x^4 - 14x^2 + 9$ were a product of two irreducible
quadratics in $\mathbb{Q}[x]$, then after factoring these quadratics in $\mathbb{R}[x]$ and $\mathbb{C}[x]$, we would
obtain the identical factorization as in equation (2). Therefore, if $x^4 - 14x^2 + 9$ were a
product of two quadratics in $\mathbb{Q}[x]$, we could then regroup the four linear factors in
equation (2) so that they could be multiplied to produce two quadratic factors in
$\mathbb{Q}[x]$. However, you should check that whenever we multiply $x - (\sqrt{2} + \sqrt{5})$ by any of
the three other linear factors, the quadratic polynomial we obtain contains a
coefficient which does not belong to $\mathbb{Q}$. Thus, $x^4 - 14x^2 + 9$ is irreducible in $\mathbb{Q}[x]$ and
so $[\mathbb{Q}(\sqrt{2} + \sqrt{5}) : \mathbb{Q}] = 4$.




5. Earlier, we examined $\mathbb{Q}(\mathrm{cis}(\pi/5))$ and saw that it was a field whose dimension over $\mathbb{Q}$
was at most 10 because $\{(\mathrm{cis}(\pi/5))^j \mid 0 \le j \le 9\}$ was a spanning set. To determine the
exact degree of this extension, we need to find the minimum polynomial in $\mathbb{Q}[x]$
satisfied by $\mathrm{cis}(\pi/5)$. Certainly $\mathrm{cis}(\pi/5)$ is a root of $x^{10} - 1$, and since $x^{10} - 1$ is a
difference of squares, we have

$$x^{10} - 1 = (x^5 - 1)(x^5 + 1).$$

Since $\mathrm{cis}(\pi/5)$ is a root of $x^5 + 1$, we now know that $[\mathbb{Q}(\mathrm{cis}(\pi/5)) : \mathbb{Q}] \le 5$. However, $-1$
is a root of $x^5 + 1$ and we have

$$x^5 + 1 = (x + 1)(x^4 - x^3 + x^2 - x + 1).$$

Therefore, $\mathrm{cis}(\pi/5)$ is a root of $x^4 - x^3 + x^2 - x + 1$, so $[\mathbb{Q}(\mathrm{cis}(\pi/5)) : \mathbb{Q}] \le 4$. We now
need to determine if $x^4 - x^3 + x^2 - x + 1$ is irreducible over $\mathbb{Q}$.
Suppose $p(x), q(x) \in \mathbb{Q}[x]$ such that

$$x^4 - x^3 + x^2 - x + 1 = p(x) \cdot q(x).$$

Observe that by replacing $x$ by $-x$ in the previous equation, we obtain

$$x^4 + x^3 + x^2 + x + 1 = p(-x) \cdot q(-x).$$

However, in the last example at the end of Section 9.4, we used a variation of
Eisenstein's Criterion to show that $x^4 + x^3 + x^2 + x + 1$ is irreducible in $\mathbb{Q}[x]$. As a
result, either $p(-x)$ or $q(-x)$ has degree 4. It easily follows that either $p(x)$ or $q(x)$
also has degree 4. Therefore, $x^4 - x^3 + x^2 - x + 1$ cannot be written as a product of
polynomials of smaller degree in $\mathbb{Q}[x]$, so $x^4 - x^3 + x^2 - x + 1$ is irreducible in $\mathbb{Q}[x]$.
By Theorem 15.3, it follows that $[\mathbb{Q}(\mathrm{cis}(\pi/5)) : \mathbb{Q}] = 4$.
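The two factorizations used in the last example are easy to confirm by multiplying out. The following is a minimal sketch, with the hypothetical helpers `poly_mul` and `poly_eval` operating on integer coefficient lists (lowest degree first); it also checks numerically, in floating point and therefore only approximately, that cis(pi/5) is a root of the quartic. It does not test irreducibility, which still requires the argument given in the text.

```python
import cmath

def poly_mul(f, g):
    """Multiply two integer polynomials given as coefficient lists, lowest degree first."""
    result = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            result[i + j] += a * b
    return result

def poly_eval(f, z):
    """Evaluate f at z using Horner's rule."""
    value = 0
    for coefficient in reversed(f):
        value = value * z + coefficient
    return value

x5_minus_1 = [-1, 0, 0, 0, 0, 1]
x5_plus_1 = [1, 0, 0, 0, 0, 1]
x_plus_1 = [1, 1]
quartic = [1, -1, 1, -1, 1]               # x^4 - x^3 + x^2 - x + 1
x10_minus_1 = [-1] + [0] * 9 + [1]

print(poly_mul(x5_minus_1, x5_plus_1) == x10_minus_1)
print(poly_mul(x_plus_1, quartic) == x5_plus_1)

# Floating-point check that cis(pi/5) is (numerically) a root of the quartic.
theta = cmath.exp(1j * cmath.pi / 5)
print(abs(poly_eval(quartic, theta)) < 1e-12)
```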


These examples indicate that determining the degrees of some extensions of the form $\mathbb{Q}(\theta)$ can
be handled in seconds using Theorem 15.3, yet others require not only Theorem 15.3 but also a 
series of computations. When dealing with field extensions and other topics in abstract 
algebra, we can often come up with very quick answers to questions by invoking powerful 
theorems. This is quite satisfying and yields short and elegant solutions. However, there will 
always be problems where fancy algebraic machinery will take us only so far, and to complete 
these problems, we will need to roll up our sleeves and do lots and lots of old-fashioned paper 
and pencil computations. 


We now fully understand exactly which fields are represented by the symbols

$$\mathbb{Q}(\sqrt{3}), \quad \mathbb{Q}(i), \quad \mathbb{R}(i), \quad \mathbb{Q}(7^{1/3}), \quad \mathbb{Q}(\sqrt{2} + \sqrt{5}), \quad \mathbb{Q}(\mathrm{cis}(\pi/5)).$$


We can now turn attention to determining what we mean by the symbols 
Q(V3.i), Q(v2, v3, V5), Q(7%, cis (Z)). 
The fields 
Q(v2), @(24), @(28), @(2m), R 


all have the property that they lie between $\mathbb{Q}$ and $\mathbb{C}$ and also contain $\sqrt{2}$. However, any field
that lies between $\mathbb{Q}$ and $\mathbb{C}$ and contains $\sqrt{2}$ must contain every element of the form $a + b\sqrt{2}$,
where $a, b \in \mathbb{Q}$. Therefore, $\mathbb{Q}(\sqrt{2})$ is clearly the smallest field that lies between $\mathbb{Q}$ and $\mathbb{C}$ and
contains $\sqrt{2}$.


More generally, if $K \subseteq L$ are fields and $\theta \in L$ is the root of some nonzero element of $K[x]$,
then any field that lies between $K$ and $L$ and also contains $\theta$ must contain $\theta^n$, for all $n \in \mathbb{N}$.
Therefore, such a field contains $K$ as well as the entire spanning set for $K(\theta)$ over $K$. As a
result, although there are likely many fields that lie between $K$ and $L$ and contain $\theta$, every one
of them must contain $K(\theta)$. Therefore, $K(\theta)$ is the smallest field that lies between $K$ and $L$ and
also contains $\theta$.


Using the previous paragraph as a guide, we can look at $\mathbb{Q}(\sqrt{3}, i)$. It should represent the
smallest field that lies between $\mathbb{Q}$ and $\mathbb{C}$ and contains both $\sqrt{3}$ and $i$. If we want to view things
abstractly, we could simply say that $\mathbb{Q}(\sqrt{3}, i)$ is the intersection of all fields that lie between $\mathbb{Q}$
and $\mathbb{C}$ and contain both $\sqrt{3}$ and $i$. We would then need to check that this intersection is indeed
a field in its own right. Although this is a legitimate way to view this problem, to a certain
degree, it is quite unsatisfying. Not only do we want to know that a smallest field exists that
lies between $\mathbb{Q}$ and $\mathbb{C}$ and contains both $\sqrt{3}$ and $i$, but we would also like to explicitly describe
its elements in terms of $\mathbb{Q}$, $\sqrt{3}$, and $i$.


To this end, let $V$ be the span over $\mathbb{Q}$ of all elements of the form $(\sqrt{3})^n i^m$, where $n, m \ge 0$.
Observe that not only is $V$ a vector space over $\mathbb{Q}$, but it is also closed under multiplication. In
addition, since $\sqrt{3}$ is a root of $x^2 - 3$, $(\sqrt{3})^n$ is always equal to an element in the span over $\mathbb{Q}$
of the set $\{1, \sqrt{3}\}$. Similarly, $i$ is a root of $x^2 + 1$, so $i^m$ is always equal to an element in the
span over $\mathbb{Q}$ of the set $\{1, i\}$. As a result, the set $V$ is equal to the span over $\mathbb{Q}$ of the finite set
$\{1, \sqrt{3}, i, \sqrt{3}i\}$. By Lemma 15.2, we know that $V$ is a field. Observe that every field that lies
between $\mathbb{Q}$ and $\mathbb{C}$ and also contains $\sqrt{3}$ and $i$ must contain $V$, so $V$ is the smallest field with
these properties. Furthermore, we can describe every element of $V$ in terms of $\mathbb{Q}$, $\sqrt{3}$, and $i$.
Note that the only issue we have not as yet settled is whether the degree of $V$ over $\mathbb{Q}$ is 4 or
something smaller. This set $V$ is what we mean by the symbol $\mathbb{Q}(\sqrt{3}, i)$. We are now in a
position to generalize this example.
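Closure of the span of 1, sqrt(3), i, and sqrt(3)i under multiplication can be seen concretely by multiplying coordinate vectors. The following is a minimal sketch, assuming an element is stored as four rational coordinates in that basis order; the names `BASIS` and `multiply` are illustrative. Each product of basis elements is reduced using the two relations that the square of sqrt(3) is 3 and the square of i is -1.

```python
from fractions import Fraction

# Basis order: 1, sqrt(3), i, sqrt(3)*i. Each basis element is sqrt(3)^n * i^m
# with exponents (n, m) in {0, 1} x {0, 1}.
BASIS = [(0, 0), (1, 0), (0, 1), (1, 1)]

def multiply(x, y):
    """Multiply two elements given by rational coordinates in BASIS."""
    result = [Fraction(0)] * 4
    for (n1, m1), a in zip(BASIS, x):
        for (n2, m2), b in zip(BASIS, y):
            coefficient = Fraction(a) * Fraction(b)
            n, m = n1 + n2, m1 + m2
            if n == 2:               # sqrt(3)^2 = 3
                coefficient *= 3
                n = 0
            if m == 2:               # i^2 = -1
                coefficient *= -1
                m = 0
            result[BASIS.index((n, m))] += coefficient
    return result

sqrt3_times_i = [0, 0, 0, 1]
print(multiply(sqrt3_times_i, sqrt3_times_i))    # coordinates of (sqrt(3) i)^2

x = [1, 2, 0, 1]                     # 1 + 2*sqrt(3) + sqrt(3) i
y = [0, 1, 1, 0]                     # sqrt(3) + i
print(multiply(x, y))                # coordinates of the product in BASIS
```

Every product lands back in the same four-dimensional span, which is the closure property that Lemma 15.2 needs.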




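Theorem 15.4. Let $K \subseteq L$ be fields and let $\theta_1, \theta_2, \ldots, \theta_n \in L$ such that each $\theta_i$ is the root of
some nonzero $p_i(x) \in K[x]$ of degree $m_i$ in $K[x]$. If we let

$$K(\theta_1, \theta_2, \ldots, \theta_n) = \Big\{ \sum \alpha_{j_1 j_2 \cdots j_n} \theta_1^{j_1} \theta_2^{j_2} \cdots \theta_n^{j_n} \;\Big|\; \alpha_{j_1 j_2 \cdots j_n} \in K,\ 0 \le j_i < m_i \Big\},$$

then $K(\theta_1, \theta_2, \ldots, \theta_n)$ is the smallest field that lies between $K$ and $L$ and also contains each $\theta_i$.
Furthermore, the degree of $K(\theta_1, \theta_2, \ldots, \theta_n)$ over $K$ is at most $m_1 \cdot m_2 \cdots m_n$.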


Proof. Let W denote the span over K of all elements of Z of the form 6,! f 6,2 oo On!” where no 
restriction is placed on how large the various /; can be. Since our spanning set for W is closed 
under multiplication, it is clear that W is closed under multiplication. Also observe that 

K (61, 62, ...,6,) is the span over K of all elements of the form 6, /10)/2.--6,J", where each Ji 
is bounded by m;. Therefore, K(6), 62, ...,6,) is a subset of W and has a spanning set with at 
most m,-m2---m, distinct elements, so the dimension of K(6), 02,...,0,) over K is at most 
m,-m---my. Also observe that K (61, 02,...,0,) must be contained in any field that lies 
between K and L and also contains each 6;. Therefore, it suffices to show that K(0,, 
62,...,9,) is a field. 


Since $K(\theta_1, \theta_2, \ldots, \theta_n)$ is finite dimensional over $K$ and $W$ is closed under multiplication, Lemma 15.2 asserts that to conclude the proof, we only need to show that $K(\theta_1, \theta_2, \ldots, \theta_n) = W$. Therefore, we need to show that every element of the form $\theta_1^{l_1} \theta_2^{l_2} \cdots \theta_n^{l_n}$ belongs to $K(\theta_1, \theta_2, \ldots, \theta_n)$, even if no restrictions are placed on the $l_i$. We now make use of some ideas that appeared in the proof of Theorem 15.3. For each $l_i$, we will apply the division algorithm in $K[x]$ and divide $x^{l_i}$ by $p_i(x)$ to obtain

$$x^{l_i} = q_i(x) \cdot p_i(x) + r_i(x),$$

where $q_i(x), r_i(x) \in K[x]$ and either $r_i(x) = 0$ or $r_i(x)$ has degree less than $m_i$. Substituting $\theta_i$ into the preceding polynomials, we can see that in $L$ we have

$$\theta_i^{l_i} = q_i(\theta_i) \cdot p_i(\theta_i) + r_i(\theta_i) = q_i(\theta_i) \cdot 0 + r_i(\theta_i) = r_i(\theta_i).$$

Therefore, in the expression $\theta_1^{l_1} \theta_2^{l_2} \cdots \theta_n^{l_n}$, we can replace each $\theta_i^{l_i}$ by $r_i(\theta_i)$ to obtain

$$\theta_1^{l_1} \theta_2^{l_2} \cdots \theta_n^{l_n} = r_1(\theta_1)\, r_2(\theta_2) \cdots r_n(\theta_n).$$


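Each $r_i(\theta_i)$ belongs to the span over $K$ of the set $\{1, \theta_i, \theta_i^2, \ldots, \theta_i^{m_i - 1}\}$. Therefore, after repeated applications of the distributive law, the product $r_1(\theta_1)\, r_2(\theta_2) \cdots r_n(\theta_n)$ belongs to the span over $K$ of elements of $L$ of the form $\theta_1^{j_1} \theta_2^{j_2} \cdots \theta_n^{j_n}$, where each $j_i$ is less than $m_i$. Thus,

$$\theta_1^{l_1} \theta_2^{l_2} \cdots \theta_n^{l_n} = r_1(\theta_1)\, r_2(\theta_2) \cdots r_n(\theta_n) \in K(\theta_1, \theta_2, \ldots, \theta_n).$$

Hence, $W = K(\theta_1, \theta_2, \ldots, \theta_n)$, thereby concluding the proof.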
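The reduction step used in this proof, replacing a high power of a root by the remainder after division by its polynomial, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the choice $p(x) = x^3 - 2$ (so the root is $2^{1/3}$ and $m = 3$) and the helper names `poly_rem` and `reduce_power` are ours, not the text's, and polynomials are stored as coefficient lists with the lowest degree first.

```python
from fractions import Fraction

def poly_rem(num, den):
    """Remainder of num divided by den in Q[x].

    Polynomials are coefficient lists, lowest degree first.
    The leading coefficient of den (its last entry) must be nonzero.
    """
    rem = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    while len(rem) >= len(den):
        factor = rem[-1] / den[-1]
        shift = len(rem) - len(den)
        for k, c in enumerate(den):
            rem[shift + k] -= factor * c
        rem.pop()  # the leading coefficient is now zero, so drop it
    return rem

def reduce_power(l, p):
    """Return r(x) with x**l = q(x) * p(x) + r(x) and deg r < deg p."""
    return poly_rem([0] * l + [1], p)

# Hypothetical example: p(x) = x^3 - 2, whose real root is 2^(1/3).
r = reduce_power(7, [-2, 0, 0, 1])
print(r)  # coefficients of r(x), lowest degree first
```

Here the remainder is $4x$, which matches the hand computation $\theta^7 = (\theta^3)^2 \cdot \theta = 4\theta$ for $\theta = 2^{1/3}$.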


Degrees and Galois Groups of Field Extensions 583 


Let us now take a look at the field $\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7})$. Theorem 15.4 tells us what a spanning set over $\mathbb{Q}$ for this field looks like. It then becomes clear that this field is the same as the fields


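$$\mathbb{Q}(i, \sqrt{2}, \sqrt{3}, 7^{1/5}, 11^{1/7}), \quad \mathbb{Q}(11^{1/7}, \sqrt{3}, i, \sqrt{2}, 7^{1/5}),$$
$$\mathbb{Q}(\sqrt{3}, 7^{1/5}, i, 11^{1/7}, \sqrt{2}), \quad \text{and} \quad \mathbb{Q}(\sqrt{3}, i, 7^{1/5}, \sqrt{2}, 11^{1/7}).$$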


More generally, when we look at the field $K(\theta_1, \theta_2, \ldots, \theta_n)$, no matter how we permute the order of the $\theta_i$, we continue to obtain the same field. This holds because multiplication in $L$ is commutative, so Theorem 15.4 continues to produce the same spanning sets regardless of the order of the $\theta_i$.


We can now look at another field similar to $\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7})$ and examine $\mathbb{Q}(\sqrt{3}, \sqrt{2})(i)(7^{1/5}, 11^{1/7})$. Observe that this field is built in steps. First, starting with $\mathbb{Q}$, we use Theorem 15.4 to construct the smallest field contained in $\mathbb{C}$ that also contains $\sqrt{3}$ and $\sqrt{2}$, thereby obtaining $\mathbb{Q}(\sqrt{3}, \sqrt{2})$. Next, starting with $\mathbb{Q}(\sqrt{3}, \sqrt{2})$, we use Theorem 15.3 or 15.4 to construct the smallest field contained in $\mathbb{C}$ that also contains $i$, giving us $\mathbb{Q}(\sqrt{3}, \sqrt{2})(i)$. Finally, we begin with $\mathbb{Q}(\sqrt{3}, \sqrt{2})(i)$ and apply Theorem 15.4 to construct the smallest field contained in $\mathbb{C}$ that also contains $7^{1/5}$ and $11^{1/7}$, obtaining the field $\mathbb{Q}(\sqrt{3}, \sqrt{2})(i)(7^{1/5}, 11^{1/7})$. As we use Theorems 15.3 and 15.4 to construct $\mathbb{Q}(\sqrt{3}, \sqrt{2})(i)(7^{1/5}, 11^{1/7})$, we can see that it has the same spanning set over $\mathbb{Q}$ as does $\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7})$. Thus, $\mathbb{Q}(\sqrt{3}, \sqrt{2})(i)(7^{1/5}, 11^{1/7}) = \mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7})$.


More generally, if $K \subseteq L$ are fields and $\theta_1, \ldots, \theta_5 \in L$ are all roots of nonzero elements of $K[x]$, then Theorems 15.3 and 15.4 tell us that the fields $K(\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)$ and $K(\theta_1, \theta_2)(\theta_3)(\theta_4, \theta_5)$ are the same because they have the same spanning set over $K$. However, a problem arises here, and it can be handled by extending a concept introduced in Chapter 11.


Definition 15.5. Let $K \subseteq L$ be fields. An element $\theta \in L$ is said to be algebraic over $K$ if it is the root of some nonzero $p(x) \in K[x]$.


Recall, in Chapter 11, we said that an element of $\mathbb{R}$ was algebraic if it satisfied some nonzero $p(x) \in \mathbb{Q}[x]$. Using the terminology of Definition 15.5, Theorem 15.4 tells us that the construction of the field $K(\theta_1, \ldots, \theta_n)$ requires that each $\theta_i \in L$ is algebraic over $K$. We just remarked that the fields $K(\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)$ and $K(\theta_1, \theta_2)(\theta_3)(\theta_4, \theta_5)$ are the same provided each $\theta_i$ is algebraic over $K$. However, if we look carefully at the construction of $K(\theta_1, \theta_2)(\theta_3)(\theta_4, \theta_5)$ with Theorems 15.3 and 15.4, we need that $\theta_1, \theta_2$ are algebraic over $K$, $\theta_3$ is algebraic over $K(\theta_1, \theta_2)$, and $\theta_4, \theta_5$ are algebraic over $K(\theta_1, \theta_2)(\theta_3)$. On the surface, it appears that $\theta_3$ could be algebraic over $K(\theta_1, \theta_2)$ without being algebraic over $K$. Similarly, $\theta_4, \theta_5$ could be algebraic over $K(\theta_1, \theta_2)(\theta_3)$ without being algebraic over $K$. This raises the question whether it is possible to select $\theta_i \in L$ such that $K(\theta_1, \theta_2)(\theta_3)(\theta_4, \theta_5)$ can be constructed with Theorem 15.4 but $K(\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)$ cannot. This question will be answered by Theorem 15.9. However, we first need to prove a result that will not only enable us to prove Theorem 15.9 but also to compute degrees of field extensions such as $[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7}) : \mathbb{Q}]$.


Theorem 15.6. Let $F \subseteq K \subseteq L$ be fields. If $[L : K]$ and $[K : F]$ are both finite, then $[L : F] = [L : K] \cdot [K : F]$.


Proof. Since $L$ is a finite dimensional vector space over $K$, let $v_1, v_2, \ldots, v_n \in L$ be a basis for $L$ over $K$. Similarly, $K$ is a finite dimensional vector space over $F$ and we can let $w_1, w_2, \ldots, w_m \in K$ be a basis for $K$ over $F$. Next, consider all elements of $L$ of the form $w_i v_j$, where $1 \le i \le m$, $1 \le j \le n$. To complete the proof, it suffices to show that these $nm$ elements of $L$ are indeed linearly independent and a spanning set for $L$ over $F$.


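To this end, let $v \in L$. Since the $v_j$ span $L$ over $K$, there exist $\beta_j \in K$ such that

$$v = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_n v_n.$$

On the other hand, since $\beta_j \in K$, for all $j$, there exist $\alpha_{ij} \in F$ such that

$$\beta_j = \alpha_{1j} w_1 + \alpha_{2j} w_2 + \cdots + \alpha_{mj} w_m.$$

Combining the previous two equations, we have

$$v = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_n v_n = (\alpha_{11} w_1 + \alpha_{21} w_2 + \cdots + \alpha_{m1} w_m) v_1 + (\alpha_{12} w_1 + \alpha_{22} w_2 + \cdots + \alpha_{m2} w_m) v_2 + \cdots + (\alpha_{1n} w_1 + \alpha_{2n} w_2 + \cdots + \alpha_{mn} w_m) v_n = \sum_{i \le m,\ j \le n} \alpha_{ij} (w_i v_j).$$

Thus, the elements of the form $w_i v_j$ do span $L$ over $F$.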


Next, suppose $\alpha_{ij} \in F$ such that $\sum_{i \le m,\ j \le n} \alpha_{ij} (w_i v_j) = 0$. To show that the $w_i v_j$ are linearly independent, we need to show that each $\alpha_{ij} = 0$. Observe that

$$0 = \sum_{i \le m,\ j \le n} \alpha_{ij} (w_i v_j) = \Big( \sum_{i \le m} \alpha_{i1} w_i \Big) v_1 + \Big( \sum_{i \le m} \alpha_{i2} w_i \Big) v_2 + \cdots + \Big( \sum_{i \le m} \alpha_{in} w_i \Big) v_n.$$

The $v_j$ are linearly independent over $K$ and, for all $j$, $\sum_{i \le m} \alpha_{ij} w_i \in K$. Therefore, for all $j$,

$$\alpha_{1j} w_1 + \alpha_{2j} w_2 + \cdots + \alpha_{mj} w_m = 0.$$

However, the $w_i$ are linearly independent over $F$ and every $\alpha_{ij} \in F$. Therefore, for all $i, j$, we have $\alpha_{ij} = 0$. Thus, the $nm$ elements $w_i v_j$ are both linearly independent and span $L$ over $F$. Hence, $[L : F] = nm = [L : K][K : F]$, as desired.
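As a concrete instance of the basis built in this proof, one can take $F = \mathbb{Q}$, $K = \mathbb{Q}(\sqrt{3})$, and $L = \mathbb{Q}(\sqrt{3}, i)$. The degrees used here are the ones computed in Example 1 later in this section, so the display below is an illustration of the statement rather than an independent derivation.

```latex
\begin{align*}
\{w_1, w_2\} &= \{1, \sqrt{3}\} && \text{a basis for } K \text{ over } F, \\
\{v_1, v_2\} &= \{1, i\} && \text{a basis for } L \text{ over } K, \\
\{w_i v_j\} &= \{1, \sqrt{3}, i, i\sqrt{3}\} && \text{a basis for } L \text{ over } F, \\
[L : F] &= 4 = 2 \cdot 2 = [L : K]\,[K : F].
\end{align*}
```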


Theorem 15.6 will also be the key piece of the puzzle, in the next chapter, when we prove that 60° angles cannot be trisected. At this point, given fields $K \subseteq L$, it is natural to wonder how many fields $E$ there are such that $K \subseteq E \subseteq L$. We will now show that if $[L : K]$ is prime, then no field can lie properly between $K$ and $L$.


Corollary 15.7. Let $K \subseteq L$ be fields.

(a) If $[L : K] = 1$, then $L = K$.

(b) If $[L : K] = p$, where $p$ is prime, and $E$ is a field such that $K \subseteq E \subseteq L$, then either $E = L$ or $E = K$.


Proof. Part (a) should come as little surprise, as the concept of dimension reflects the relative size of a vector space over a field. Therefore, if $[L : K] = 1$, we would intuitively think of $L$ as being the same size as $K$, so $L$ and $K$ should be equal. For a formal proof, let $\beta$ be any nonzero element of $K$. Since $[L : K] = 1$, the set $\{\beta\}$ is a subset of $L$ which is linearly independent over $K$ and has the same size as the dimension of $L$ over $K$. By Lemma 14.12, $\{\beta\}$ is also a spanning set for $L$ over $K$. Hence, if $l \in L$, there exists $\alpha \in K$ such that $\alpha \cdot \beta = l$. Since both $\alpha$ and $\beta$ belong to $K$, so does $l$. Thus, $L = K$, as desired.

For part (b), we begin by looking at $E$ as a vector space over $K$. Since $E$ is a subspace of $L$, Proposition 14.15 tells us that $[E : K]$ is finite. Next, we look at $L$ as a vector space over $E$. Any basis for $L$ over $K$ is certainly a spanning set for $L$ over $E$. By Lemma 14.13, $[L : E]$ is also finite. Therefore, we can now apply Theorem 15.6 to assert that

$$p = [L : K] = [L : E] \cdot [E : K].$$

Since $p$ is prime, either $[L : E] = 1$ or $[E : K] = 1$. However, part (a) now tells us that either $E = L$ or $E = K$, thereby concluding the proof.


Observe that Corollary 15.7(b) immediately tells us that there are no fields that lie properly between $\mathbb{R}$ and $\mathbb{C}$. Similarly, there are no fields that lie properly between $\mathbb{Q}$ and $\mathbb{Q}(\sqrt{2})$, nor




: 1 ; 
any that lie properly between Q and Q(673). We can now use Theorem 15.6 to reexamine 
some of the field extensions introduced earlier in this section. 


Examples


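1. Let us compute $[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}]$. Since $\mathbb{Q}(\sqrt{3}, i) = \mathbb{Q}(\sqrt{3})(i)$, we can look at this problem as

$$\mathbb{Q} \subseteq \mathbb{Q}(\sqrt{3}) \subseteq \mathbb{Q}(\sqrt{3})(i).$$

Theorem 15.6 now asserts that

$$[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}] = [\mathbb{Q}(\sqrt{3})(i) : \mathbb{Q}] = [\mathbb{Q}(\sqrt{3})(i) : \mathbb{Q}(\sqrt{3})]\,[\mathbb{Q}(\sqrt{3}) : \mathbb{Q}].$$

Since $\sqrt{3}$ is a root of $x^2 - 3$ and $x^2 - 3$ is irreducible in $\mathbb{Q}[x]$, $[\mathbb{Q}(\sqrt{3}) : \mathbb{Q}] = 2$. Note that $\mathbb{Q}(\sqrt{3}) \subseteq \mathbb{R}$, so $i \notin \mathbb{Q}(\sqrt{3})$. Since $i$ is a root of $x^2 + 1$, it follows that $x^2 + 1$ is irreducible in $\mathbb{Q}(\sqrt{3})[x]$. Thus, $[\mathbb{Q}(\sqrt{3})(i) : \mathbb{Q}(\sqrt{3})] = 2$. As a result, $[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}] = 2 \cdot 2 = 4$.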


At this point, one could ask if we could have also looked at this problem as

$$[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}] = [\mathbb{Q}(i)(\sqrt{3}) : \mathbb{Q}] = [\mathbb{Q}(i)(\sqrt{3}) : \mathbb{Q}(i)]\,[\mathbb{Q}(i) : \mathbb{Q}]. \tag{3}$$

If we took this approach, we would immediately obtain that $[\mathbb{Q}(i) : \mathbb{Q}] = 2$. However, it takes some work to show that $\sqrt{3} \notin \mathbb{Q}(i)$. Therefore, the first approach is the easier one. On the other hand, note that since the first approach told us that $[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}] = 4$, we could combine that with equation (3) to see that $[\mathbb{Q}(i)(\sqrt{3}) : \mathbb{Q}(i)] = 2$. Therefore, we can use Theorem 15.6 to quickly establish that $\sqrt{3} \notin \mathbb{Q}(i)$.


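2. We will now compute $[\mathbb{Q}(7^{1/3}, \mathrm{cis}(\pi/5)) : \mathbb{Q}]$. Two approaches are to consider this problem as either

$$\mathbb{Q} \subseteq \mathbb{Q}(7^{1/3}) \subseteq \mathbb{Q}(7^{1/3})(\mathrm{cis}(\pi/5))$$

or as

$$\mathbb{Q} \subseteq \mathbb{Q}(\mathrm{cis}(\pi/5)) \subseteq \mathbb{Q}(\mathrm{cis}(\pi/5))(7^{1/3}).$$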


www-.elsevierdirect.com 




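As we will see, the easiest solution to this problem combines both approaches. Since $7^{1/3}$ is a root of $x^3 - 7$ and $\mathrm{cis}(\pi/5)$ is a root of $x^4 - x^3 + x^2 - x + 1$ and both of these polynomials are irreducible in $\mathbb{Q}[x]$, we know that

$$[\mathbb{Q}(7^{1/3}) : \mathbb{Q}] = 3 \quad \text{and} \quad [\mathbb{Q}(\mathrm{cis}(\pi/5)) : \mathbb{Q}] = 4.$$

Furthermore, $x^3 - 7$ might not be irreducible in $\mathbb{Q}(\mathrm{cis}(\pi/5))[x]$ and $x^4 - x^3 + x^2 - x + 1$ might not be irreducible in $\mathbb{Q}(7^{1/3})[x]$, so we can also say that

$$[\mathbb{Q}(\mathrm{cis}(\pi/5))(7^{1/3}) : \mathbb{Q}(\mathrm{cis}(\pi/5))] \le 3 \quad \text{and} \quad [\mathbb{Q}(7^{1/3})(\mathrm{cis}(\pi/5)) : \mathbb{Q}(7^{1/3})] \le 4.$$

Summarizing, we know the following:

(a) $[\mathbb{Q}(7^{1/3}, \mathrm{cis}(\pi/5)) : \mathbb{Q}] \le 12$.

(b) $[\mathbb{Q}(7^{1/3}, \mathrm{cis}(\pi/5)) : \mathbb{Q}]$ is a multiple of 4.

(c) $[\mathbb{Q}(7^{1/3}, \mathrm{cis}(\pi/5)) : \mathbb{Q}]$ is a multiple of 3.

Since $[\mathbb{Q}(7^{1/3}, \mathrm{cis}(\pi/5)) : \mathbb{Q}]$ is divisible by the relatively prime integers 3 and 4, it must be divisible by their product, $12 = 3 \cdot 4$. Combined with the fact that $[\mathbb{Q}(7^{1/3}, \mathrm{cis}(\pi/5)) : \mathbb{Q}] \le 12$, we can now conclude that $[\mathbb{Q}(7^{1/3}, \mathrm{cis}(\pi/5)) : \mathbb{Q}] = 12$. Observe that at this point, we could now use Theorem 15.6 to assert that $x^3 - 7$ is irreducible in $\mathbb{Q}(\mathrm{cis}(\pi/5))[x]$ and $x^4 - x^3 + x^2 - x + 1$ is irreducible in $\mathbb{Q}(7^{1/3})[x]$.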


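3. We now turn our attention to $[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7}) : \mathbb{Q}]$. Solving this problem will require applying Theorem 15.6 several times. First we will consider

$$[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}) : \mathbb{Q}], \quad [\mathbb{Q}(7^{1/5}) : \mathbb{Q}], \quad \text{and} \quad [\mathbb{Q}(11^{1/7}) : \mathbb{Q}].$$

We know that $[\mathbb{Q}(\sqrt{3}) : \mathbb{Q}] = 2$. You should do the computations necessary to convince yourself that $\sqrt{2} \notin \mathbb{Q}(\sqrt{3})$. Combined with the fact that $\sqrt{2}$ is a root of $x^2 - 2$, it follows that $[\mathbb{Q}(\sqrt{3})(\sqrt{2}) : \mathbb{Q}(\sqrt{3})] = 2$. We can now use Theorem 15.6 to see that $[\mathbb{Q}(\sqrt{3}, \sqrt{2}) : \mathbb{Q}] = 4$. Note that $\mathbb{Q}(\sqrt{3}, \sqrt{2}) \subseteq \mathbb{R}$, so $i \notin \mathbb{Q}(\sqrt{3}, \sqrt{2})$. When we combine this with the fact that $i$ is a root of $x^2 + 1$, we see that $[\mathbb{Q}(\sqrt{3}, \sqrt{2})(i) : \mathbb{Q}(\sqrt{3}, \sqrt{2})] = 2$. Theorem 15.6 now tells us that $[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}) : \mathbb{Q}] = 8$.

Observe that $7^{1/5}$ is a root of $x^5 - 7$ and $11^{1/7}$ is a root of $x^7 - 11$, and both of these polynomials are irreducible in $\mathbb{Q}[x]$ by Eisenstein's Criterion. Therefore,

$$[\mathbb{Q}(7^{1/5}) : \mathbb{Q}] = 5 \quad \text{and} \quad [\mathbb{Q}(11^{1/7}) : \mathbb{Q}] = 7.$$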




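Theorem 15.6 tells us that $[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7}) : \mathbb{Q}]$ must be divisible by $[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}) : \mathbb{Q}]$, $[\mathbb{Q}(7^{1/5}) : \mathbb{Q}]$, and $[\mathbb{Q}(11^{1/7}) : \mathbb{Q}]$. In light of our preceding work, $[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7}) : \mathbb{Q}]$ is divisible by 8, 5, and 7. Since any pair of these numbers is relatively prime, it follows that $[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7}) : \mathbb{Q}]$ is divisible by the product

$$8 \cdot 5 \cdot 7 = 280.$$

Now let's take a slightly different approach. Using Theorem 15.6, we have

$$[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7}) : \mathbb{Q}] = [\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5})(11^{1/7}) : \mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5})] \cdot [\mathbb{Q}(\sqrt{3}, i, \sqrt{2})(7^{1/5}) : \mathbb{Q}(\sqrt{3}, i, \sqrt{2})] \cdot [\mathbb{Q}(\sqrt{3}, i, \sqrt{2}) : \mathbb{Q}].$$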


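Since $11^{1/7}$ is a root of $x^7 - 11$ and $x^7 - 11$ may or may not be irreducible in $\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5})[x]$, we know that

$$[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5})(11^{1/7}) : \mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5})] \le 7.$$

Similarly, $7^{1/5}$ is a root of $x^5 - 7$ and $x^5 - 7$ may or may not be irreducible in $\mathbb{Q}(\sqrt{3}, i, \sqrt{2})[x]$, so

$$[\mathbb{Q}(\sqrt{3}, i, \sqrt{2})(7^{1/5}) : \mathbb{Q}(\sqrt{3}, i, \sqrt{2})] \le 5.$$

In addition, we already showed that $[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}) : \mathbb{Q}] = 8$. Therefore, Theorem 15.6 now tells us that

$$[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7}) : \mathbb{Q}] = [\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5})(11^{1/7}) : \mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5})] \cdot [\mathbb{Q}(\sqrt{3}, i, \sqrt{2})(7^{1/5}) : \mathbb{Q}(\sqrt{3}, i, \sqrt{2})] \cdot [\mathbb{Q}(\sqrt{3}, i, \sqrt{2}) : \mathbb{Q}] \le 7 \cdot 5 \cdot 8 = 280.$$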


Our preceding work now tells us that not only is $[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7}) : \mathbb{Q}] \le 280$, but it is also divisible by 280. Combining these facts, we now know that $[\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7}) : \mathbb{Q}] = 280$. Once again, we could now go back and use Theorem 15.6 to show that $x^7 - 11$ is irreducible in $\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5})[x]$ and $x^5 - 7$ is irreducible in $\mathbb{Q}(\sqrt{3}, i, \sqrt{2})[x]$.
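The counting argument shared by Examples 2 and 3 (the degree is a multiple of several pairwise relatively prime integers and is bounded above by their product) can be checked mechanically. The sketch below is only an illustration; the function name `candidate_degrees` is ours and does not come from the text.

```python
from functools import reduce
from math import gcd

def lcm_all(values):
    """Least common multiple of a list of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)

def candidate_degrees(divisors, upper_bound):
    """All positive integers <= upper_bound divisible by every entry of divisors."""
    step = lcm_all(divisors)
    return list(range(step, upper_bound + 1, step))

print(candidate_degrees([3, 4], 12))      # the situation in Example 2
print(candidate_degrees([8, 5, 7], 280))  # the situation in Example 3
```

In both calls exactly one candidate survives, which is why the divisibility facts together with the upper bound pin down the degree.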


If $K \subseteq L$ are fields such that $[L : K]$ is finite, it follows that every $\theta \in L$ is algebraic over $K$. To see this, suppose $[L : K] = n$ and consider the elements $1, \theta, \ldots, \theta^n$. Observe that this set must be linearly dependent over $K$, so there exist $\alpha_i \in K$ such that

$$\alpha_0 + \alpha_1 \theta + \cdots + \alpha_n \theta^n = 0,$$

with at least one $\alpha_i \ne 0$. If we let

$$p(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_n x^n,$$

then $p(x)$ is a nonzero element of $K[x]$ that has $\theta$ as a root. Therefore, $\theta$ is algebraic over $K$. This motivates


Definition 15.8. If $K \subseteq L$ are fields, we say that $L$ is algebraic over $K$ if every element of $L$ is algebraic over $K$.
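The linear-dependence argument that precedes Definition 15.8 can be carried out by machine in a small case. The sketch below is an illustration under stated assumptions: it works in $\mathbb{Q}(\sqrt{2}, \sqrt{3})$, storing $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$ as the tuple `(a, b, c, d)`; it takes for granted that these four elements form a basis over $\mathbb{Q}$; and it assumes $1, \theta, \theta^2, \theta^3$ are linearly independent, so that the dependence first appears at $\theta^4$. All function names are ours.

```python
from fractions import Fraction

def mul(x, y):
    """Multiply two elements a + b*sqrt2 + c*sqrt3 + d*sqrt6 given as 4-tuples."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,
    )

def solve(rows, rhs):
    """Gauss-Jordan elimination over Q for a square, invertible system."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def annihilating_polynomial(theta):
    """Coefficients (lowest degree first) of a monic quartic with theta as a root."""
    powers = [(1, 0, 0, 0)]
    for _ in range(4):
        powers.append(mul(powers[-1], theta))
    # Column j of the system holds the coordinates of theta**j, for j = 0..3.
    rows = [[powers[j][k] for j in range(4)] for k in range(4)]
    coeffs = solve(rows, powers[4])  # theta**4 = sum of coeffs[j] * theta**j
    return [-c for c in coeffs] + [Fraction(1)]

p = annihilating_polynomial((0, 1, 1, 0))  # theta = sqrt2 + sqrt3
print(p)
```

For $\theta = \sqrt{2} + \sqrt{3}$ the result corresponds to $x^4 - 10x^2 + 1$, the polynomial that reappears in Section 15.2.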


We saw that if $[L : K]$ is finite, then $L$ is algebraic over $K$. However, the converse does not hold. For example, consider the field $L = \mathbb{Q}(2^{1/2}, 2^{1/3}, 2^{1/4}, 2^{1/5}, \ldots)$. Then $L$ is the smallest field that lies between $\mathbb{Q}$ and $\mathbb{C}$ and contains $2^{1/n}$, for every $n \in \mathbb{N}$. If $\theta \in L$, then $\theta$ can be written using only a finite number of roots of 2. Therefore, there exists some $m \in \mathbb{N}$ such that $\theta \in \mathbb{Q}(2^{1/2}, 2^{1/3}, \ldots, 2^{1/m})$. Using Theorem 15.4, it follows that $[\mathbb{Q}(2^{1/2}, 2^{1/3}, \ldots, 2^{1/m}) : \mathbb{Q}]$ is finite. Since $\mathbb{Q}(\theta)$ is a subfield of $\mathbb{Q}(2^{1/2}, 2^{1/3}, \ldots, 2^{1/m})$, it must also be a finite extension of $\mathbb{Q}$. Thus, $\theta$ is algebraic over $\mathbb{Q}$ and so $L$ is algebraic over $\mathbb{Q}$.

If $[L : \mathbb{Q}]$ were finite, then $[L : \mathbb{Q}] = t$, for some $t \in \mathbb{N}$. However, $L$ contains the field $\mathbb{Q}(2^{1/(t+1)})$. Observe that $2^{1/(t+1)}$ is a root of the polynomial $x^{t+1} - 2$, and, by Eisenstein's Criterion, this polynomial is irreducible in $\mathbb{Q}[x]$. Therefore, Theorem 15.3 asserts that $[\mathbb{Q}(2^{1/(t+1)}) : \mathbb{Q}] = t + 1$. Thus, $\mathbb{Q}(2^{1/(t+1)})$ is a subfield of $L$ whose dimension over $\mathbb{Q}$ exceeds that of $L$, a contradiction. As a result, $L$ is an algebraic extension of $\mathbb{Q}$ that is not a finite extension of $\mathbb{Q}$. However, even though algebraic extensions need not be finite extensions, there is a result on algebraic extensions that is somewhat analogous to Theorem 15.6.


Theorem 15.9. Let $F \subseteq K \subseteq L$ be fields such that $L$ is algebraic over $K$ and $K$ is algebraic over $F$. Then $L$ is algebraic over $F$.


Proof. Let $\theta \in L$; since $L$ is algebraic over $K$, there exists some nonzero

$$p(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_n x^n \in K[x]$$

such that $p(\theta) = 0$. Since $K$ is algebraic over $F$, each $\alpha_i$ is algebraic over $F$. Therefore, by Theorem 15.4, we know that $F(\alpha_0, \alpha_1, \ldots, \alpha_n)$ is a finite dimensional field extension of $F$. Next, by Theorem 15.6, we have

$$[F(\alpha_0, \ldots, \alpha_n, \theta) : F] = [F(\alpha_0, \ldots, \alpha_n, \theta) : F(\alpha_0, \alpha_1, \ldots, \alpha_n)]\,[F(\alpha_0, \alpha_1, \ldots, \alpha_n) : F].$$

Observe that $p(x) \in F(\alpha_0, \alpha_1, \ldots, \alpha_n)[x]$ and $\theta$ is a root of $p(x)$, so Theorem 15.3 tells us that $[F(\alpha_0, \ldots, \alpha_n, \theta) : F(\alpha_0, \alpha_1, \ldots, \alpha_n)]$ is finite. Since we already know that $[F(\alpha_0, \alpha_1, \ldots, \alpha_n) : F]$ is finite, the preceding equation tells us that $[F(\alpha_0, \ldots, \alpha_n, \theta) : F]$ is finite. Since $\theta$ is contained in a finite extension of $F$, $\theta$ is algebraic over $F$. Thus, every element of $L$ is algebraic over $F$, as desired.


Observe that Theorem 15.9 immediately resolves the problem we faced before Definition 15.5 regarding extensions of the form $K(\theta_1, \theta_2)(\theta_3)(\theta_4, \theta_5)$. Indeed, if $K(\theta_1, \theta_2)$ is an algebraic extension of $K$ and $\theta_3$ is algebraic over $K(\theta_1, \theta_2)$, then Theorem 15.9 implies that $\theta_3$ is algebraic over $K$. Similarly, if $K(\theta_1, \theta_2)(\theta_3)$ is algebraic over $K$ and $\theta_4, \theta_5$ are algebraic over $K(\theta_1, \theta_2)(\theta_3)$, then Theorem 15.9 asserts that $\theta_4, \theta_5$ are also algebraic over $K$. Therefore, if $K \subseteq L$ and $\theta_i \in L$ such that we can construct $K(\theta_1, \theta_2)(\theta_3)(\theta_4, \theta_5)$ using Theorem 15.4, then each $\theta_i$ is algebraic over $K$. As a result, we can also use Theorem 15.4 to construct $K(\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)$, and $K(\theta_1, \theta_2)(\theta_3)(\theta_4, \theta_5) = K(\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)$. A more general way to look at this is that when constructing fields that lie between $K$ and $L$ and contain various $\theta_i \in L$, Theorems 15.4 and 15.9 give us complete freedom regarding both the ordering and grouping of the $\theta_i$.


Exercises for Section 15.1 


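In exercises 1-8, we examine fields $F$ where $\mathbb{Q} \subseteq F \subseteq \mathbb{Q}(\sqrt{7}, 5^{1/3})$.

1. Compute $[\mathbb{Q}(\sqrt{7}) : \mathbb{Q}]$.

2. Compute $[\mathbb{Q}(5^{1/3}) : \mathbb{Q}]$.

3. Compute $[\mathbb{Q}(\sqrt{7}, 5^{1/3}) : \mathbb{Q}]$.

4. Compute $[\mathbb{Q}(\sqrt{7}, 5^{1/3}) : \mathbb{Q}(\sqrt{7})]$.

5. Compute $[\mathbb{Q}(\sqrt{7}, 5^{1/3}) : \mathbb{Q}(5^{1/3})]$.

6. If $K$ is a field such that $\mathbb{Q} \subseteq K \subseteq \mathbb{Q}(\sqrt{7}, 5^{1/3})$, what are the only possible values for $[K : \mathbb{Q}]$?

7. If $\alpha \in \mathbb{Q}(\sqrt{7})$ and $\alpha \notin \mathbb{Q}$, find $[\mathbb{Q}(\sqrt{7}) : \mathbb{Q}(\alpha)]$.

8. If $\beta \in \mathbb{Q}(5^{1/3})$ and $\beta \notin \mathbb{Q}$, find $[\mathbb{Q}(5^{1/3}) : \mathbb{Q}(\beta)]$.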




In exercises 9-27, we examine fields K where Q C K € Q(14!,315,/i). 
9. Compute [Q(14#) : QJ. 

10. Compute [Q(315) : QJ. 

11. Compute [Q(i) : Q]. 

12. Compute [Q(147, 315) : QJ. 

13. Compute [Q(144, i) : Q]. 

14. Compute [Q(315, i) : Q]. 

15. Compute [Q(147, 315, i) : Q). 

16. Compute [Q(147, 315) : Q(144)]. 

17. Compute [Q(144, i) : Q(142)]. 

18. Compute [Q(147, 315) :Q(315)]. 

19. Compute [Q(315, i) : Q(315)). 

20. Compute [Q(147, i) : Q(i)]. 

21. Compute [Q(315, i) : Q()]. 

22. Compute [Q(142, 315, i’) : Q(14?, 315)]. 

23. Compute [Q(144, 315, i) : Q(14!, a)]. 


24. Compute [Q(144, 315, i): QG15, a]. 


25. If K isa field such that QC K C Q(14;, 315, i), what are the only possible values for 
[K :Q]? 


26. Ifa €Q(315) anda ¢Q, find [Q(@) : Ql. 
27. Find Be Q(142) such that [Q(B) : Q] =2. 


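28. If $F, K_1, K_2, \ldots, K_n, L$ are fields such that $F \subseteq K_i \subseteq L$, for all $i$, show that $K = \bigcap_{i=1}^{n} K_i$ is a field such that $F \subseteq K \subseteq L$.

In exercises 29-34, let $a, b \in \mathbb{C}$ such that $[\mathbb{Q}(a) : \mathbb{Q}] = m$ and $[\mathbb{Q}(b) : \mathbb{Q}] = n$, where $m$ and $n$ are relatively prime.

29. Compute $[\mathbb{Q}(a, b) : \mathbb{Q}]$.

30. Compute $[\mathbb{Q}(a, b) : \mathbb{Q}(a)]$ and then check if this is equal to $[\mathbb{Q}(b) : \mathbb{Q}]$.

31. Compute $[\mathbb{Q}(a, b) : \mathbb{Q}(b)]$ and then check if this is equal to $[\mathbb{Q}(a) : \mathbb{Q}]$.

32. If $c \in \mathbb{Q}(a)$ and $d \in \mathbb{Q}(b)$, show that $[\mathbb{Q}(c) : \mathbb{Q}]$ and $[\mathbb{Q}(d) : \mathbb{Q}]$ are relatively prime.

33. If $c \in \mathbb{Q}(a)$ and $d \in \mathbb{Q}(b)$, show that $[\mathbb{Q}(c, d) : \mathbb{Q}] = [\mathbb{Q}(c) : \mathbb{Q}] \cdot [\mathbb{Q}(d) : \mathbb{Q}]$.

34. Show that $\mathbb{Q}(a) \cap \mathbb{Q}(b) = \mathbb{Q}$.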


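In exercises 35-44, we examine fields $L$ where $\mathbb{Q} \subseteq L \subseteq \mathbb{Q}(2^{1/6}, 2^{1/10})$. If $\alpha, \beta \in \mathbb{C}$ are algebraic over $\mathbb{Q}$, then $[\mathbb{Q}(\alpha, \beta) : \mathbb{Q}(\alpha)] \le [\mathbb{Q}(\beta) : \mathbb{Q}]$ and $[\mathbb{Q}(\alpha, \beta) : \mathbb{Q}(\beta)] \le [\mathbb{Q}(\alpha) : \mathbb{Q}]$. In many of the preceding exercises, equality occurred. However, as we shall see in the following exercises, equality does not always occur.

35. Show $\mathbb{Q}(2^{1/6}, 2^{1/10}) = \mathbb{Q}(2^{1/30})$.

36. Compute $[\mathbb{Q}(2^{1/6}, 2^{1/10}) : \mathbb{Q}]$. If necessary, use exercise 35.

37. Compute $[\mathbb{Q}(2^{1/6}) : \mathbb{Q}]$.

38. Compute $[\mathbb{Q}(2^{1/10}) : \mathbb{Q}]$.

39. Compute $[\mathbb{Q}(2^{1/6}, 2^{1/10}) : \mathbb{Q}(2^{1/6})]$ and check if this is equal to $[\mathbb{Q}(2^{1/10}) : \mathbb{Q}]$.

40. Find the minimum polynomial for $2^{1/10}$ over $\mathbb{Q}$ and over $\mathbb{Q}(2^{1/6})$.

41. Compute $[\mathbb{Q}(2^{1/6}, 2^{1/10}) : \mathbb{Q}(2^{1/10})]$ and check if this is equal to $[\mathbb{Q}(2^{1/6}) : \mathbb{Q}]$.

42. Find the minimum polynomial for $2^{1/6}$ over $\mathbb{Q}$ and over $\mathbb{Q}(2^{1/10})$.

43. Compute $[\mathbb{Q}(2^{1/6}, 2^{1/10}) : \mathbb{Q}(2^{1/2})]$.

44. Show that $\mathbb{Q}(2^{1/6}) \cap \mathbb{Q}(2^{1/10}) = \mathbb{Q}(2^{1/2})$.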

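In exercises 45-54, we examine fields $E$ where $\mathbb{Q} \subseteq E \subseteq \mathbb{Q}(7^{1/6}, 7^{1/15})$.

45. Show $\mathbb{Q}(7^{1/6}, 7^{1/15}) = \mathbb{Q}(7^{1/30})$.

46. Compute $[\mathbb{Q}(7^{1/6}, 7^{1/15}) : \mathbb{Q}]$. If necessary, use exercise 45.

47. Compute $[\mathbb{Q}(7^{1/6}) : \mathbb{Q}]$.

48. Compute $[\mathbb{Q}(7^{1/15}) : \mathbb{Q}]$.

49. Compute $[\mathbb{Q}(7^{1/6}, 7^{1/15}) : \mathbb{Q}(7^{1/6})]$ and check if this is equal to $[\mathbb{Q}(7^{1/15}) : \mathbb{Q}]$.

50. Find the minimum polynomial for $7^{1/15}$ over $\mathbb{Q}$ and over $\mathbb{Q}(7^{1/6})$.

51. Compute $[\mathbb{Q}(7^{1/6}, 7^{1/15}) : \mathbb{Q}(7^{1/15})]$ and check if this is equal to $[\mathbb{Q}(7^{1/6}) : \mathbb{Q}]$.

52. Find the minimum polynomial for $7^{1/6}$ over $\mathbb{Q}$ and over $\mathbb{Q}(7^{1/15})$.

53. Compute $[\mathbb{Q}(7^{1/6}, 7^{1/15}) : \mathbb{Q}(7^{1/3})]$.

54. Show that $\mathbb{Q}(7^{1/6}) \cap \mathbb{Q}(7^{1/15}) = \mathbb{Q}(7^{1/3})$.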


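In exercises 55-68, we examine fields $K$ where $\mathbb{Q} \subseteq K \subseteq \mathbb{Q}(\sqrt{2}, \sqrt{3}, i)$.

55. Compute $[\mathbb{Q}(\sqrt{2}) : \mathbb{Q}]$.

56. Compute $[\mathbb{Q}(\sqrt{3}) : \mathbb{Q}]$.

57. Compute $[\mathbb{Q}(i) : \mathbb{Q}]$.

58. Compute $[\mathbb{Q}(\sqrt{2}, \sqrt{3}) : \mathbb{Q}]$.

59. Compute $[\mathbb{Q}(\sqrt{2}, i) : \mathbb{Q}]$.

60. Compute $[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}]$.

61. Compute $[\mathbb{Q}(\sqrt{2}, \sqrt{3}, i) : \mathbb{Q}]$.

62. Compute both $[\mathbb{Q}(\sqrt{6}) : \mathbb{Q}]$ and $[\mathbb{Q}(\sqrt{2}, \sqrt{3}) : \mathbb{Q}(\sqrt{6})]$.

63. Compute both $[\mathbb{Q}(i\sqrt{2}) : \mathbb{Q}]$ and $[\mathbb{Q}(\sqrt{2}, i) : \mathbb{Q}(i\sqrt{2})]$.

64. Compute both $[\mathbb{Q}(i\sqrt{3}) : \mathbb{Q}]$ and $[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}(i\sqrt{3})]$.

65. Compute both $[\mathbb{Q}(\sqrt{2} + \sqrt{3}) : \mathbb{Q}]$ and $[\mathbb{Q}(\sqrt{2}, \sqrt{3}) : \mathbb{Q}(\sqrt{2} + \sqrt{3})]$.

66. Compute both $[\mathbb{Q}(\sqrt{2} + i) : \mathbb{Q}]$ and $[\mathbb{Q}(\sqrt{2}, i) : \mathbb{Q}(\sqrt{2} + i)]$.

67. Compute both $[\mathbb{Q}(\sqrt{3} + i) : \mathbb{Q}]$ and $[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}(\sqrt{3} + i)]$.

68. Compute $[\mathbb{Q}(\sqrt{2} + \sqrt{3} + i) : \mathbb{Q}]$.

69. If $K \subseteq L$ are fields, show that $a \in L$ is algebraic over $K$ if and only if $a^2$ is algebraic over $K$.

70. Let $K \subseteq L$ be fields where $a \in L$ and $f(x) \in K[x]$ has degree at least one. In a generalization of exercise 69, show that $a$ is algebraic over $K$ if and only if $f(a)$ is algebraic over $K$.

71. If $K \subseteq L$ are fields, show that the elements of $L$ that are algebraic over $K$ form a field.

72. If $K \subseteq L$ are fields and $a \in L$ is algebraic over $K$ and $b \in L$ is not algebraic over $K$, show that $a + b$ is not algebraic over $K$.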




15.2 Simple Extensions 


In the previous section, we began by looking at fields of the form $\mathbb{Q}(\sqrt{3})$, $\mathbb{Q}(i)$, and $\mathbb{Q}(7^{1/5})$ and then more complicated fields like $\mathbb{Q}(\sqrt{3}, i, \sqrt{2}, 7^{1/5}, 11^{1/7})$. It certainly appears that the more elements we attach to a field, the more complicated things become. However, it turns out that

$$\mathbb{Q}(\sqrt{3}, i) = \mathbb{Q}(\sqrt{3} + i) \quad \text{and} \quad \mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\sqrt{2} + \sqrt{3}).$$

Therefore, we can sometimes take a field that is generated by more than one element and simplify the situation by generating it by a single element. It is natural to ask how often we can do this. To answer this, we begin with


Definition 15.10. If $K \subseteq L$ are fields with $[L : K]$ finite, we say that $L$ is a simple extension if there exists some $\theta \in L$ such that $L = K(\theta)$.

The goal of this section is to show that whenever $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ are fields such that $[L : K]$ is finite, then $L$ is a simple extension of $K$. It is not hard to prove slightly more general results, but for the applications we have in mind, it suffices to look at fields that lie between $\mathbb{Q}$ and $\mathbb{C}$.

We first need a technical lemma about linear functions.


Lemma 15.11. Let $A, B$ be finite subsets of $\mathbb{C}$ and let $\alpha, \beta \in \mathbb{C}$, where $\alpha \notin A$. Then there exists a nonzero integer $m$ such that the linear function

$$l(x) = m(x - \alpha) + \beta$$

has the property that $l(a) \notin B$, for all $a \in A$.

Proof. Since $A$ and $B$ are finite sets, there are only a finite number of elements of $\mathbb{C}$ that are of the form $\frac{b - \beta}{a - \alpha}$, where $a \in A$ and $b \in B$ (the denominators are nonzero because $\alpha \notin A$). Therefore, we can choose some nonzero $m \in \mathbb{Z}$ that is not of this form. We claim that the function $l(x) = m(x - \alpha) + \beta$ has the desired property. By way of contradiction, suppose $a \in A$ such that $l(a) = b \in B$. Then we have

$$b = m(a - \alpha) + \beta,$$

and solving for $m$ yields $m = \frac{b - \beta}{a - \alpha}$, which contradicts our choice of $m$. Thus, $l(x)$ has the desired property.


We can now prove 


Theorem 15.12. Let $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ be fields such that $[L : K]$ is finite. Then there exists some $\theta \in L$ such that $L = K(\theta)$.

Proof. Since $[L : K]$ is finite, there exists a finite set $\{\theta_1, \ldots, \theta_n\}$ that is a basis for $L$ over $K$. In particular, this says that $L = K(\theta_1, \ldots, \theta_n)$. Therefore, if $[L : K]$ is finite, there always exists some $n \in \mathbb{N}$ such that $L = K(\theta_1, \ldots, \theta_n)$. Therefore, it will now suffice to use Mathematical Induction to show that whenever $L = K(\theta_1, \ldots, \theta_n)$, there exists $\theta \in L$ such that $L = K(\theta)$.


The case where $n = 1$ is obvious, so we need to show that if the result holds for some $k \in \mathbb{N}$, then it also holds for $k + 1$. Therefore, let us consider the field $L = K(\theta_1, \ldots, \theta_k, \theta_{k+1})$. Observe that if we let $K_1 = K(\theta_1, \ldots, \theta_k)$, then the fact that the result holds for $k$ tells us that there exists $\alpha \in K_1 \subseteq L$ such that $K_1 = K(\alpha)$. For convenience, we can let $\beta = \theta_{k+1}$ and using this notation, we now have

$$L = K(\theta_1, \ldots, \theta_k, \theta_{k+1}) = K(\theta_1, \ldots, \theta_k)(\theta_{k+1}) = K(\alpha)(\beta) = K(\alpha, \beta).$$

Since $\alpha, \beta$ are algebraic over $K$, we can let $f(x), g(x)$ be, respectively, the minimum polynomials for $\alpha$ and $\beta$ in $K[x]$. Next, we can apply the Fundamental Theorem of Algebra to assert that both $f(x)$ and $g(x)$ can be factored into linear factors in $\mathbb{C}[x]$. Now let $A$ be all the roots of $f(x)$ other than $\alpha$ and let $B$ be all the roots of $g(x)$.


Using $A, B, \alpha, \beta$ from the previous paragraph, we can let $l(x) = m(x - \alpha) + \beta$ be the linear function described in Lemma 15.11. We can rewrite $l(x)$ as $mx + (\beta - m\alpha)$, so $l(x) \in K(\beta - m\alpha)[x]$. If we let $h(x) = g(l(x))$, then since $f(x), g(x) \in K[x] \subseteq K(\beta - m\alpha)[x]$, we can see that the polynomials $f(x)$ and $h(x)$ both belong to the polynomial ring $K(\beta - m\alpha)[x]$.

Observe that

$$h(\alpha) = g(l(\alpha)) = g(m(\alpha - \alpha) + \beta) = g(\beta) = 0.$$

Therefore, $\alpha$ is a root of both $h(x)$ and $f(x)$. On the other hand, suppose $a$ is any root of $f(x)$ other than $\alpha$. Then, by our choice of $l(x)$, $l(a)$ is not a root of $g(x)$. Hence, computing in $\mathbb{C}$, we have

$$h(a) = g(l(a)) \ne 0.$$

Thus, $\alpha$ is the only root that $f(x)$ and $h(x)$ have in common.


Computing in the polynomial ring $K(\beta - m\alpha)[x]$, we can let $c(x)$ denote the greatest common divisor of $f(x)$ and $h(x)$. Recall that Lemma 12.18 tells us that $c(x)$ is the greatest common divisor of $f(x)$ and $h(x)$, even if we work in the larger polynomial ring $\mathbb{C}[x]$. Observe that the roots in $\mathbb{C}$ of $c(x)$ are precisely the roots that $f(x)$ and $h(x)$ have in common. As a result, $\alpha$ is the only root of $c(x)$. Furthermore, since $f(x)$ is irreducible in $K[x]$, Corollary 12.23 tells us that $f(x)$ does not have any multiple roots in $\mathbb{C}$. Since any multiple root of $c(x)$ would also be a multiple root of $f(x)$, it follows that $\alpha$ is a root of $c(x)$ of multiplicity one, so

$$c(x) = x - \alpha \in K(\beta - m\alpha)[x].$$

Every coefficient of $c(x)$ must belong to $K(\beta - m\alpha)$, so $\alpha \in K(\beta - m\alpha)$, which immediately tells us that

$$\beta = (\beta - m\alpha) + m \cdot \alpha \in K(\beta - m\alpha).$$

Since $\alpha, \beta \in K(\beta - m\alpha)$, we have

$$K(\beta - m\alpha) = K(\alpha, \beta) = L.$$

If we now let $\theta = \beta - m\alpha$, then $L = K(\theta)$, as desired.

The combination of Lemma 15.11 and Theorem 15.12 not only tells us that finite extensions $K \subseteq L$ that lie between $\mathbb{Q}$ and $\mathbb{C}$ are simple extensions, it also explains how to find an appropriate $\theta$ such that $L = K(\theta)$.


Examples


1. Consider $\mathbb{Q}(\sqrt{3}, i)$; we can begin by letting $\alpha = i$ and $\beta = \sqrt{3}$. According to the arguments in Lemma 15.11 and Theorem 15.12, we only need to avoid nonzero integers that are of the form $\frac{b - \sqrt{3}}{a - i}$, where $a$ is a root of $x^2 + 1$ other than $i$ and $b$ is a root of $x^2 - 3$. Since this restriction does not exclude any nonzero integers, it follows that $\mathbb{Q}(\sqrt{3} - mi) = \mathbb{Q}(\sqrt{3}, i)$, for every nonzero integer $m$. In particular, $\mathbb{Q}(\sqrt{3} + i) = \mathbb{Q}(\sqrt{3}, i)$.

2. Let us now examine $\mathbb{Q}(\sqrt{2}, \sqrt{3})$; we can begin by taking $\alpha = \sqrt{3}$ and $\beta = \sqrt{2}$. In this example, the nonzero integers we need to avoid are those of the form $\frac{b - \sqrt{2}}{a - \sqrt{3}}$, where $a$ is a root of $x^2 - 3$ other than $\sqrt{3}$ and $b$ is a root of $x^2 - 2$. Once again, this restriction does not exclude any nonzero integers, thus $\mathbb{Q}(\sqrt{2} - m\sqrt{3}) = \mathbb{Q}(\sqrt{2}, \sqrt{3})$, for every nonzero integer $m$. In particular, $\mathbb{Q}(\sqrt{2} + \sqrt{3}) = \mathbb{Q}(\sqrt{2}, \sqrt{3})$.


3. For a more involved example, let us combine the previous examples and consider $\mathbb{Q}(\sqrt{2}, \sqrt{3}, i)$. From our work in the previous example, we know that $\mathbb{Q}(\sqrt{2}, \sqrt{3}, i) = \mathbb{Q}(\sqrt{2} + \sqrt{3}, i)$. This time, let $\alpha = i$ and $\beta = \sqrt{2} + \sqrt{3}$. The only nonzero integers we need to avoid are those of the form $\frac{b - (\sqrt{2} + \sqrt{3})}{a - i}$, where $a$ is a root of $x^2 + 1$ and $b$ is a root of the minimum polynomial for $\sqrt{2} + \sqrt{3}$. Observe that $\sqrt{2} + \sqrt{3}$ is a root of $x^4 - 10x^2 + 1$, and it follows from Theorem 9.15 that $x^4 - 10x^2 + 1$ is the minimum polynomial for $\sqrt{2} + \sqrt{3}$ in $\mathbb{Q}[x]$. The four roots of $x^4 - 10x^2 + 1$ are easily seen to be $\pm\sqrt{2} \pm \sqrt{3}$. Therefore, when we are looking at objects of the form $\frac{b - (\sqrt{2} + \sqrt{3})}{a - i}$, $b$ is always a real number, and the only choice for $a$ is $-i$. In particular, the nonzero values of $\frac{b - (\sqrt{2} + \sqrt{3})}{-2i}$ are never real numbers. Thus, $\mathbb{Q}(\sqrt{2}, \sqrt{3}, i) = \mathbb{Q}(\sqrt{2} + \sqrt{3} - mi)$, for every nonzero integer $m$. In particular, $\mathbb{Q}(\sqrt{2}, \sqrt{3}, i) = \mathbb{Q}(\sqrt{2} + \sqrt{3} + i)$.


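4. The preceding examples might be somewhat misleading, as you might be tempted to believe that whenever $\mathbb{Q}(\alpha, \beta)$ is a finite extension of $\mathbb{Q}$, then $\mathbb{Q}(\alpha, \beta) = \mathbb{Q}(\alpha + \beta)$. We now provide an example that shows that this is not the case. Consider $\mathbb{Q}(\sqrt{2} + i, \sqrt{3} - i)$; if we let $\alpha = \sqrt{2} + i$ and $\beta = \sqrt{3} - i$, then $\alpha + \beta = \sqrt{2} + \sqrt{3}$. However, $\mathbb{Q}(\sqrt{2} + \sqrt{3}) \subseteq \mathbb{R}$, whereas $\mathbb{Q}(\sqrt{2} + i, \sqrt{3} - i)$ contains elements of $\mathbb{C}$ which do not belong to $\mathbb{R}$. Thus,

$$\mathbb{Q}(\sqrt{2} + i, \sqrt{3} - i) \ne \mathbb{Q}(\sqrt{2} + \sqrt{3}).$$

Observe that $\alpha + \beta$ corresponds to the value of $\beta - m\alpha$ when $m = -1$, and if we examine the roots of the minimum polynomials for $\sqrt{2} + i$ and $\sqrt{3} - i$, we can see that the only nonzero integer excluded by Lemma 15.11 and Theorem 15.12 is $m = -1$. Therefore,

$$\mathbb{Q}(\sqrt{2} + i, \sqrt{3} - i) = \mathbb{Q}((\sqrt{3} - i) - m(\sqrt{2} + i)),$$

for any integer $m$ other than 0 and $-1$. In particular, if we let $m = 1$, we obtain

$$\mathbb{Q}(\sqrt{2} + i, \sqrt{3} - i) = \mathbb{Q}((\sqrt{3} - i) - (\sqrt{2} + i)).$$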
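The excluded values of $m$ in these examples can be found numerically. The sketch below is an illustration only: it uses floating-point complex arithmetic with a tolerance, so it suggests rather than proves which integers are excluded. It also takes as given that the roots of the minimum polynomial of $\sqrt{2} + i$ are $\pm\sqrt{2} \pm i$ and those of $\sqrt{3} - i$ are $\pm\sqrt{3} \pm i$. The function name `excluded_integers` is ours.

```python
from math import sqrt

def excluded_integers(alpha, beta, roots_f, roots_g, tol=1e-9):
    """Nonzero integers of the form (b - beta) / (a - alpha).

    a runs over the roots of f other than alpha (the set A of Lemma 15.11),
    and b runs over all roots of g (the set B).
    """
    bad = set()
    for a in roots_f:
        if abs(a - alpha) < tol:
            continue  # alpha itself is not in A
        for b in roots_g:
            m = (b - beta) / (a - alpha)
            nearest = round(m.real)
            if nearest != 0 and abs(m - nearest) < tol:
                bad.add(nearest)
    return sorted(bad)

s2, s3 = sqrt(2), sqrt(3)

# Example 4: alpha = sqrt2 + i, beta = sqrt3 - i.
alpha, beta = complex(s2, 1), complex(s3, -1)
roots_f = [complex(x, y) for x in (s2, -s2) for y in (1, -1)]
roots_g = [complex(x, y) for x in (s3, -s3) for y in (1, -1)]
print(excluded_integers(alpha, beta, roots_f, roots_g))

# Example 1: alpha = i, beta = sqrt3.
print(excluded_integers(1j, complex(s3, 0), [1j, -1j], [complex(s3, 0), complex(-s3, 0)]))
```

The first call reports only $-1$, in agreement with Example 4, and the second reports no excluded integers, in agreement with Example 1.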


Since Theorem 15.12 allows us to view various finite extensions as simple extensions, it greatly assists us in obtaining our first result comparing the size of a Galois group to the degree of a field extension. Recall that Galois groups were defined back in Definition 5.17. Also in Chapter 5, Corollary 5.13 generalized the familiar fact that roots of polynomials in $\mathbb{R}[x]$ occur in conjugate pairs in $\mathbb{C}$. In the following proof, we will use Corollary 5.13 as it tells us that if $g \in \mathrm{Gal}(L/K)$ and if $\theta \in L$ is the root of some $p(x) \in K[x]$, then $g(\theta)$ is also a root of $p(x)$.


Theorem 15.13. Let $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ be fields such that $[L : K]$ is finite. Then $|\mathrm{Gal}(L/K)| \le [L : K]$.


Proof. If $g \in \mathrm{Gal}(L/K)$, we need to see how $g$ can behave on the various elements of $L$. Since $[L : K]$ is finite, Theorem 15.12 asserts that $L = K(\theta)$, for some $\theta \in L$. Let $n = [L : K]$; therefore, if $p(x)$ is the minimum polynomial in $K[x]$ for $\theta$ then, by Theorem 15.3, we know that $p(x)$ has degree $n$. If $v \in L$, Theorem 15.3 also tells us that

$$v = \alpha_0 + \alpha_1 \theta + \cdots + \alpha_{n-1} \theta^{n-1},$$

where every $\alpha_i \in K$. Since $g$ is an automorphism that is the identity map on $K$, we have

$$g(v) = g(\alpha_0 + \alpha_1 \theta + \cdots + \alpha_{n-1} \theta^{n-1}) = g(\alpha_0) + g(\alpha_1 \theta) + \cdots + g(\alpha_{n-1} \theta^{n-1}) = g(\alpha_0) + g(\alpha_1) g(\theta) + \cdots + g(\alpha_{n-1}) g(\theta^{n-1}) = \alpha_0 + \alpha_1 g(\theta) + \cdots + \alpha_{n-1} g(\theta^{n-1}) = \alpha_0 + \alpha_1 g(\theta) + \cdots + \alpha_{n-1} g(\theta)^{n-1}.$$

As a result, if we knew $g(\theta)$, then we would immediately know $g(v)$. In other words, the behavior of $g$ on $\theta$ completely determines its behavior on every element of $L$. By Corollary 5.13, $g(\theta)$ must also be a root of $p(x)$. However, the roots of $p(x)$ that lie in $L$ form a list $\theta_1, \ldots, \theta_m$, where $m \le n$. Therefore, the only possible values for $g(\theta)$ are $\theta_1, \ldots, \theta_m$. At this point, it is not clear that each $\theta_i$ actually results in an element of $\mathrm{Gal}(L/K)$. However, it is clear that there are at most $m \le n$ choices for $g(\theta)$, hence

$$|\mathrm{Gal}(L/K)| \le m \le n = [L : K],$$

as desired.


Examples


1. Since $[\mathbb{C} : \mathbb{R}] = 2$, Theorem 15.13 tells us that $|\mathrm{Gal}(\mathbb{C}/\mathbb{R})| \le 2$. However, since we know that the identity map and complex conjugation are two different elements of $\mathrm{Gal}(\mathbb{C}/\mathbb{R})$, we know that

$$|\mathrm{Gal}(\mathbb{C}/\mathbb{R})| = 2 = [\mathbb{C} : \mathbb{R}].$$

2. We know that $[\mathbb{Q}(\sqrt{2}) : \mathbb{Q}] = 2$; therefore, Theorem 15.13 tells us that $|\mathrm{Gal}(\mathbb{Q}(\sqrt{2})/\mathbb{Q})| \le 2$. On the other hand, in Chapter 5 we showed that the identity map and the function defined as $\sigma(a + b\sqrt{2}) = a - b\sqrt{2}$, for all $a, b \in \mathbb{Q}$, are both elements of $\mathrm{Gal}(\mathbb{Q}(\sqrt{2})/\mathbb{Q})$. Combining these observations, we have

$$|\mathrm{Gal}(\mathbb{Q}(\sqrt{2})/\mathbb{Q})| = 2 = [\mathbb{Q}(\sqrt{2}) : \mathbb{Q}].$$

In the two preceding examples, we had $|\mathrm{Gal}(L/K)| = [L : K]$. It turns out that when examining the relationship between groups, fields, and polynomials, this is the ideal situation. However, the following example, which we first saw in Chapter 5, indicates that this ideal situation does not always occur.


3. The degree of $\mathbb{Q}(2^{1/3})$ over $\mathbb{Q}$ is 3. In addition, if $g \in \mathrm{Gal}(\mathbb{Q}(2^{1/3})/\mathbb{Q})$, then $g$ sends $2^{1/3}$ to a root of $x^3 - 2$ that belongs to $\mathbb{Q}(2^{1/3})$. However, the other two roots of $x^3 - 2$ are not real numbers and $\mathbb{Q}(2^{1/3}) \subseteq \mathbb{R}$. Therefore, $2^{1/3}$ is the only root of $x^3 - 2$ that belongs to $\mathbb{Q}(2^{1/3})$. Hence $g(2^{1/3}) = 2^{1/3}$ and $g$ is the identity map on all of $\mathbb{Q}(2^{1/3})$. As a result,

$$|\mathrm{Gal}(\mathbb{Q}(2^{1/3})/\mathbb{Q})| = 1 < 3 = [\mathbb{Q}(2^{1/3}) : \mathbb{Q}].$$


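4. Next we consider $\mathbb{Q}(\sqrt{3}, i)$; since $[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}] = 4$, Theorem 15.13 tells us that $|\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})| \le 4$. In Chapter 5, we saw that there were four candidates $g_1, g_2, g_3, g_4$ for elements of $\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$, and they can each be described in terms of their behavior on $\sqrt{3}$ and $i$ as follows:

$$g_1(\sqrt{3}) = \sqrt{3},\ g_1(i) = i, \qquad g_2(\sqrt{3}) = -\sqrt{3},\ g_2(i) = i,$$
$$g_3(\sqrt{3}) = \sqrt{3},\ g_3(i) = -i, \qquad g_4(\sqrt{3}) = -\sqrt{3},\ g_4(i) = -i.$$

It turns out that each of the preceding four candidates does indeed yield different elements of $\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$, so

$$|\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})| = 4 = [\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}].$$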


In the last example, we stated but did not prove that $g_1, g_2, g_3, g_4$ were all elements of the
Galois group. We omitted the proof because, at this point, the computations needed would be
very messy. But it raises the question: are there some conditions on $K$ and $L$ that guarantee
that $|\mathrm{Gal}(L/K)| = [L : K]$? In particular, if we knew that the previous example satisfied these
conditions, then we would automatically know that
$|\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})| = [\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}] = 4$. Thus, we would immediately know, without doing messy computations, that all four
of the $g_i$ are elements of $\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$. More generally, if we are trying to describe the
elements of $\mathrm{Gal}(L/K)$, then our job becomes much easier if we already know the size of
$\mathrm{Gal}(L/K)$. In light of this, the goal of the next section will be to find a natural condition that
will indeed guarantee that $|\mathrm{Gal}(L/K)| = [L : K]$.


15.3 Splitting Fields and Their Galois Groups 

In the previous section, we saw that if $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$, with $[L : K]$ finite, then
$|\mathrm{Gal}(L/K)| \le [L : K]$. We then examined some examples where $|\mathrm{Gal}(L/K)| = [L : K]$
and others where $|\mathrm{Gal}(L/K)| < [L : K]$. This raises the question whether there is some
natural condition we can place on $L$ that will guarantee that $|\mathrm{Gal}(L/K)| = [L : K]$.




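Recall

$$|\mathrm{Gal}(\mathbb{Q}(\sqrt{2})/\mathbb{Q})| = 2 = [\mathbb{Q}(\sqrt{2}) : \mathbb{Q}] \quad\text{and}\quad |\mathrm{Gal}(\mathbb{Q}(2^{1/3})/\mathbb{Q})| = 1 < 3 = [\mathbb{Q}(2^{1/3}) : \mathbb{Q}]. \tag{4}$$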


We need to examine what it is about $\mathbb{Q}(\sqrt{2})$ and $\mathbb{Q}(2^{1/3})$ that caused the first part of (4) to be an
equality and the second part to be an inequality. Observe that $\mathbb{Q}(\sqrt{2})$ is obtained from $\mathbb{Q}$ by
adding on one of the roots of the polynomial $x^2 - 2 \in \mathbb{Q}[x]$. However, it turns out that $\mathbb{Q}(\sqrt{2})$
actually contains all the roots of $x^2 - 2$. On the other hand, $\mathbb{Q}(2^{1/3})$ is obtained from $\mathbb{Q}$ by
adding on one root of $x^3 - 2$, but it does not contain the other two roots of $x^3 - 2$. This is no
coincidence, and it motivates


Definition 15.14. Let $\mathbb{Q} \subseteq K \subseteq \mathbb{C}$ be fields and let $f(x) \in K[x]$. If $\theta_1, \ldots, \theta_m \in \mathbb{C}$ are the
roots of $f(x)$, we call the field $K(\theta_1, \ldots, \theta_m)$ the splitting field for $f(x)$ over $K$.


Let us now briefly revisit the four examples from the end of the previous section. 


Examples


1. $\mathbb{C}$ is an extension of $\mathbb{R}$ and is the splitting field over $\mathbb{R}$ of the polynomial $x^2 + 1 \in \mathbb{R}[x]$.
In fact, the quadratic formula tells us that if $f(x)$ is any irreducible quadratic in $\mathbb{R}[x]$,
then $\mathbb{C}$ is the splitting field for $f(x)$ over $\mathbb{R}$. Furthermore, the Fundamental Theorem
of Algebra tells us that if $g(x) \in \mathbb{R}[x]$, then the splitting field for $g(x)$ over $\mathbb{R}$ is either
$\mathbb{R}$ or $\mathbb{C}$ depending on whether or not all the roots of $g(x)$ are real.


2. $\mathbb{Q}(\sqrt{2})$ is an extension of $\mathbb{Q}$ and is the splitting field over $\mathbb{Q}$ of $x^2 - 2 \in \mathbb{Q}[x]$. Using
the quadratic formula, we can see that $\mathbb{Q}(\sqrt{2})$ is the splitting field over $\mathbb{Q}$ of every
polynomial of the form $x^2 - 2ax + (a^2 - 2b^2)$, where $a, b \in \mathbb{Q}$ and $b \ne 0$. Therefore it
is also the splitting field over $\mathbb{Q}$ of the polynomials $x^2 - 2x - 1$, $x^2 - 18$, and
$x^2 - 8x - 34$.

3. The field $\mathbb{Q}(2^{1/3})$ is not the splitting field of $x^3 - 2$ over $\mathbb{Q}$. If we let $\omega = \operatorname{cis}(\frac{2\pi}{3})$, then
the three roots of $x^3 - 2$ are $2^{1/3}$, $2^{1/3}\omega$, $2^{1/3}\omega^2$. Therefore, $\mathbb{Q}(2^{1/3}, 2^{1/3}\omega, 2^{1/3}\omega^2)$ is the
splitting field of $x^3 - 2$. Observe that this field is the same as the field $\mathbb{Q}(2^{1/3}, \omega)$. This
is a larger field than $\mathbb{Q}(2^{1/3})$ as $\mathbb{Q}(2^{1/3}) \subseteq \mathbb{R}$, yet $\mathbb{Q}(2^{1/3}, \omega)$ contains $\omega$, which is not an
element of $\mathbb{R}$.


4. $\mathbb{Q}(\sqrt{3}, i)$ is the splitting field for $(x^2 - 3)(x^2 + 1) \in \mathbb{Q}[x]$ over $\mathbb{Q}$. Observe
that $\mathbb{Q}(\sqrt{3}, i)$ is also the splitting field of $(x^2 - 3)(x^2 + 1)$ over both $\mathbb{Q}(i)$ and
$\mathbb{Q}(\sqrt{3})$.
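The claim in Example 2 about polynomials of the form $x^2 - 2ax + (a^2 - 2b^2)$ can be checked with exact arithmetic. The following is a minimal sketch, not part of the text: it assumes an element $u_0 + u_1\sqrt{2}$ of $\mathbb{Q}(\sqrt{2})$ is stored as the pair `(u0, u1)`, and the function names are illustrative only.

```python
from fractions import Fraction

def mul(u, v):
    """Multiply u = (u0, u1) and v = (v0, v1), each read as u0 + u1*sqrt(2)."""
    return (u[0] * v[0] + 2 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def evaluate(a, b, root):
    """Evaluate x^2 - 2a*x + (a^2 - 2b^2) at root = (r0, r1) in Q(sqrt 2)."""
    sq = mul(root, root)
    return (sq[0] - 2 * a * root[0] + (a * a - 2 * b * b),
            sq[1] - 2 * a * root[1])

def has_roots_a_pm_b_sqrt2(a, b):
    """Check that a + b*sqrt(2) and a - b*sqrt(2) are both roots."""
    a, b = Fraction(a), Fraction(b)
    zero = (Fraction(0), Fraction(0))
    return evaluate(a, b, (a, b)) == zero and evaluate(a, b, (a, -b)) == zero

# The three polynomials named in Example 2 correspond to
# (a, b) = (1, 1), (0, 3), and (4, 5).
print([has_roots_a_pm_b_sqrt2(a, b) for a, b in [(1, 1), (0, 3), (4, 5)]])
```

Since both roots have the form $a \pm b\sqrt{2}$ with $b \ne 0$, adjoining them to $\mathbb{Q}$ gives exactly $\mathbb{Q}(\sqrt{2})$.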


The last example illustrates the fact that if $L$ is the splitting field over $K$ of some $f(x) \in K[x]$,
then, for any field $E$ with $K \subseteq E \subseteq L$, $L$ is also the splitting field for $f(x)$ over $E$. Before
proving that being a splitting field guarantees equality between the degree of the field
extension and the size of the Galois group, we need two lemmas.


Lemma 15.15. Let $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ be fields such that $L$ is the splitting field of some
$f(x) \in K[x]$. If $\pi : L \to \mathbb{C}$ is an injective homomorphism which is the identity map on $K$, then
$\pi(L) = L$.


Proof. Let $\theta_1, \ldots, \theta_m \in \mathbb{C}$ be the roots of $f(x)$. Therefore, $L = K(\theta_1, \ldots, \theta_m)$ and, by
Theorem 15.4, $L$ is a finite extension of $K$. Applying Theorem 15.12, there exists some
$\alpha \in L$ such that $L = K(\alpha)$. Next, let $p(x) \in K[x]$ be the minimum polynomial for $\alpha$ over $K$
and let $n$ be the degree of $p(x)$.


Since $\alpha \in K(\theta_1, \ldots, \theta_m)$, Theorem 15.4 asserts that $\alpha$ can be written as a linear combination
over $K$ of elements of the form $\theta_1^{j_1}\theta_2^{j_2}\cdots\theta_m^{j_m}$, where each $j_t \ge 0$. We would like to apply
Corollary 5.13 to $\pi$, but Corollary 5.13 is stated in terms of automorphisms, and we haven't
yet shown that $\pi$ is an automorphism of $L$. However, if you look back at the proofs of
Proposition 5.11 and Corollary 5.13, it is enough for $\pi$ to be an injective homomorphism from
$L$ to $\mathbb{C}$ for every step in these proofs to apply to $\pi$. Therefore, by applying the conclusion of
Corollary 5.13, we can conclude that $\pi(\theta_i)$ is also a root of $f(x)$. In particular, this says that
$\pi(\theta_i) = \theta_s$, for some $s$. Hence, $\pi(\theta_i) \in L = K(\alpha)$. But this immediately implies that $\pi(\alpha)$ is a
linear combination over $K$ of elements of the form $\pi(\theta_1)^{j_1}\pi(\theta_2)^{j_2}\cdots\pi(\theta_m)^{j_m}$, all of which
belong to $L = K(\alpha)$. As a result, $\pi(\alpha) \in K(\alpha)$.


Since $\pi$ is a homomorphism that is the identity on $K$, we have $\pi(K(\alpha)) = K(\pi(\alpha))$. When we
combine this with the fact that $\pi(\alpha) \in K(\alpha)$, we see that


$$\pi(L) = \pi(K(\alpha)) = K(\pi(\alpha)) \subseteq K(\alpha) = L.$$


The set $\pi(L)$ is now a vector space over $K$ which is contained in $L$. If we let $\beta = \pi(\alpha)$, then,
since the conclusion of Corollary 5.13 applies to $\pi$, $\beta$ is also a root of the polynomial $p(x)$. As
a result, $[K(\beta) : K]$ and $[K(\alpha) : K]$ are equal as they are both equal to the degree of $p(x)$.
Hence, $\pi(L) = \pi(K(\alpha)) = K(\beta)$ is a subspace of $L = K(\alpha)$ yet has the same dimension over
$K$ as $L$ does. Proposition 14.15 now asserts that $\pi(L) = L$.




We continue with 


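Lemma 15.16. Let $\mathbb{Q} \subseteq K \subseteq \mathbb{C}$ be fields and let $p(x) \in K[x]$ be irreducible of degree $n$. If
$\theta_1, \theta_2 \in \mathbb{C}$ are roots of $p(x)$, define the function $\pi : K(\theta_1) \to K(\theta_2)$ as


$$\pi(a_0 + a_1\theta_1 + a_2\theta_1^2 + \cdots + a_{n-1}\theta_1^{n-1}) = a_0 + a_1\theta_2 + a_2\theta_2^2 + \cdots + a_{n-1}\theta_2^{n-1},$$
where each $a_i \in K$. Then $\pi$ is a bijective homomorphism of fields that is the identity map on $K$.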


Except for showing that the function $\pi$ preserves multiplication, the proof of Lemma 15.16 is
fairly straightforward. Before reading the proof, you should try to prove that $\pi$ is bijective and
preserves addition. In the proof, we will repeatedly use the fact that for every element
$a \in K(\theta_1)$, there exists a unique polynomial in $K[x]$, which we will denote as $f_a(x)$, such that
$a = f_a(\theta_1)$ and either $\deg(f_a(x)) < n$ or $f_a(x) = 0$. Clearly, for every $c \in K(\theta_2)$, there also
exists a unique polynomial $f_c(x) \in K[x]$ with the same properties such that $c = f_c(\theta_2)$.
Representing elements of $K(\theta_1)$ and $K(\theta_2)$ as values of polynomials might appear to make
parts of the proof of Lemma 15.16 more complicated than need be, but it will be good practice
for the part of the proof where we show that $\pi$ preserves multiplication.


Proof of Lemma 15.16. By Theorem 15.3, for $i = 1, 2$, $K(\theta_i)$ is a field extension of $K$ with
basis $\{1, \theta_i, \theta_i^2, \ldots, \theta_i^{n-1}\}$. If $c \in K(\theta_2)$, let $f_c(x) \in K[x]$ be the polynomial described above
such that $c = f_c(\theta_2)$. Observe that


$$c = f_c(\theta_2) = \pi(f_c(\theta_1)),$$


hence $\pi$ is surjective.


Next, if $a, b \in K(\theta_1)$, let $f_a(x), f_b(x) \in K[x]$ be the polynomials just described such that
$a = f_a(\theta_1)$ and $b = f_b(\theta_1)$. If $\pi(a) = \pi(b)$, then we have


$$f_a(\theta_2) = \pi(f_a(\theta_1)) = \pi(a) = \pi(b) = \pi(f_b(\theta_1)) = f_b(\theta_2).$$


Therefore, $\theta_2$ is a root of $f_a(x) - f_b(x)$. Observe that $f_a(x) - f_b(x)$ either has degree less than
the degree of $p(x)$ or is equal to 0. Since $p(x)$ has the smallest degree among all nonzero
polynomials with $\theta_2$ as a root, it follows that $f_a(x) - f_b(x) = 0$. Thus, $f_a(x) = f_b(x)$, which
implies that


$$a = f_a(\theta_1) = f_b(\theta_1) = b.$$


Thus, $\pi$ is also injective, so $\pi$ is a bijection. It is also easy to see that if $a \in K$, then the
polynomial $f_a(x) \in K[x]$ such that $a = f_a(\theta_1)$ is a constant polynomial. Therefore, $\pi(a) = a$,
so $\pi$ is the identity on $K$.




To show that $\pi$ preserves addition, let $a, b \in K(\theta_1)$ and once again let $f_a(x), f_b(x) \in K[x]$ be
the polynomials just described such that $a = f_a(\theta_1)$ and $b = f_b(\theta_1)$. We now have


$$\pi(a + b) = \pi(f_a(\theta_1) + f_b(\theta_1)) = f_a(\theta_2) + f_b(\theta_2) = \pi(f_a(\theta_1)) + \pi(f_b(\theta_1)) = \pi(a) + \pi(b).$$


Finally, to show that $\pi$ preserves multiplication, let $a, b, f_a(x), f_b(x)$ be as in the previous
paragraph. Applying the division algorithm in $K[x]$, there exist $q(x), r(x) \in K[x]$, where
$\deg(r(x)) < n$ or $r(x) = 0$, such that


$$f_a(x) \cdot f_b(x) = q(x) \cdot p(x) + r(x).$$
As a result, since $p(\theta_1) = p(\theta_2) = 0$,


$$\pi(ab) = \pi(f_a(\theta_1) f_b(\theta_1)) = \pi(q(\theta_1) p(\theta_1) + r(\theta_1)) = \pi(r(\theta_1)) = r(\theta_2)$$
$$= q(\theta_2) p(\theta_2) + r(\theta_2) = f_a(\theta_2) f_b(\theta_2) = \pi(f_a(\theta_1)) \pi(f_b(\theta_1)) = \pi(a)\pi(b).$$


Thus, $\pi$ preserves multiplication, thereby concluding the proof.
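The multiplication step of the proof can be tried out on a concrete case. The sketch below is an illustration under stated assumptions, not part of the text: it takes $K = \mathbb{Q}$, $p(x) = x^3 - 2$, $\theta_1 = 2^{1/3}$ and $\theta_2 = 2^{1/3}\operatorname{cis}(\frac{2\pi}{3})$, stores an element $f_a(\theta_1)$ as the coefficient list of $f_a$, and compares $\pi(ab)$ with $\pi(a)\pi(b)$ numerically. All function names are hypothetical.

```python
import cmath
from fractions import Fraction

# p(x) = x^3 - 2, coefficients listed from the constant term upward.
P = [Fraction(-2), Fraction(0), Fraction(0), Fraction(1)]

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def poly_rem(f, p):
    """Remainder r(x) of f(x) on division by the monic polynomial p(x)."""
    f = [Fraction(c) for c in f]
    n = len(p) - 1
    while len(f) > n:
        lead = f.pop()              # coefficient of the current top-degree term
        shift = len(f) - n          # subtract lead * x^shift * p(x)
        for k in range(n):
            f[shift + k] -= lead * p[k]
    return f

def evaluate(f, z):
    """Value f(z) for a coefficient list f and a complex number z."""
    return sum(float(c) * z ** k for k, c in enumerate(f))

THETA_1 = 2 ** (1 / 3)
THETA_2 = THETA_1 * cmath.exp(2j * cmath.pi / 3)

def pi(f):
    """Image under pi of the element f(theta_1), namely f(theta_2)."""
    return evaluate(f, THETA_2)

a = [1, 2, 3]                        # a = 1 + 2*theta_1 + 3*theta_1^2
b = [-1, 5, 4]                       # b = -1 + 5*theta_1 + 4*theta_1^2
ab = poly_rem(poly_mul(a, b), P)     # the remainder r(x), so ab = r(theta_1)
print(ab, abs(pi(ab) - pi(a) * pi(b)))
```

The difference printed at the end is zero up to floating-point rounding, which mirrors the identity $r(\theta_2) = f_a(\theta_2) f_b(\theta_2)$ used in the proof.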


We can now prove the main result of this section. 


Theorem 15.17. Let $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ be such that $L$ is a splitting field of some $f(x) \in K[x]$.
Then $|\mathrm{Gal}(L/K)| = [L : K]$.


Proof. By Theorem 15.12, $L = K(\alpha)$, for some $\alpha \in L$. If we let $p(x)$ be the minimum
polynomial for $\alpha$ over $K$ and let $n$ denote the degree of $p(x)$, then Corollary 12.23 tells us
that $p(x)$ has $n$ distinct roots in $\mathbb{C}$ that we can denote as $\beta_1, \ldots, \beta_n$. Since Theorem 15.3 tells
us that $[L : K] = n$, it will suffice to show that there is exactly one element of $\mathrm{Gal}(L/K)$
corresponding to each $\beta_i$.


By Lemma 15.16, for each $i \le n$, there is a bijective homomorphism $\pi_i$ from $K(\alpha)$ to $K(\beta_i)$ that
is the identity on $K$ and sends $\alpha$ to $\beta_i$. In particular, $\pi_i$ sends $L$ to $\mathbb{C}$, so Lemma 15.15 tells us that $\pi_i(L) = L$.
Thus, each $\pi_i$ is indeed an element of $\mathrm{Gal}(L/K)$. On the other hand, all of the $\pi_i$'s are
different, as they give different values when we plug in $\alpha$. Hence, $\mathrm{Gal}(L/K)$ has at least $n$
different elements. But having already shown that $|\mathrm{Gal}(L/K)| \le [L : K]$, we can now
conclude that when $L$ is a splitting field over $K$, we have $|\mathrm{Gal}(L/K)| = [L : K]$.


It is useful to consider what role Theorem 15.17 actually plays in helping us to determine the
structure of $\mathrm{Gal}(L/K)$. In the proof of Theorem 15.17, we used the fact that $L = K(\alpha)$ to
determine the size of $\mathrm{Gal}(L/K)$ by looking at how the various $\pi_i$ act on $\alpha$. Although this is
extremely helpful in computing the size of $\mathrm{Gal}(L/K)$, it is not particularly useful in
determining the product of elements of $\mathrm{Gal}(L/K)$. For example, suppose $\pi_1, \pi_2$ are as in the
proof of Theorem 15.17, and we wish to determine if $\pi_1$ and $\pi_2$ commute. Observe that


$$\pi_1\pi_2(\alpha) = \pi_1(\beta_2) \quad\text{and}\quad \pi_2\pi_1(\alpha) = \pi_2(\beta_1).$$


Unfortunately, without additional information about the field, there is no way of knowing if
$\pi_1(\beta_2)$ and $\pi_2(\beta_1)$ are equal. Thus, there are some limitations on how we can apply
Theorem 15.17. But depending on the situation, using Theorem 15.17 to determine the size of
$\mathrm{Gal}(L/K)$ can be very helpful. To see this, we revisit the last example from the previous
section.


Example


When we tried to describe the elements of $\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$, Corollary 5.13 indicated
that there were four possible functions $g_1, g_2, g_3, g_4$ that could be elements of
$\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$. Since these four functions are the identity on $\mathbb{Q}$, their behavior on
$\mathbb{Q}(\sqrt{3}, i)$ is completely described by their action on $\sqrt{3}$ and $i$:


$$g_1(\sqrt{3}) = \sqrt{3},\ g_1(i) = i, \qquad g_2(\sqrt{3}) = -\sqrt{3},\ g_2(i) = i,$$
$$g_3(\sqrt{3}) = \sqrt{3},\ g_3(i) = -i, \qquad g_4(\sqrt{3}) = -\sqrt{3},\ g_4(i) = -i.$$


However, as we concluded the previous section, it was unclear which of these four
functions actually belong to $\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$. We could have checked that each one was
indeed an automorphism of $\mathbb{Q}(\sqrt{3}, i)$, but that would be very tedious and time
consuming.


However, it is clear that $\mathbb{Q}(\sqrt{3}, i)$ is the splitting field over $\mathbb{Q}$ of the polynomial
$(x^2 - 3)(x^2 + 1)$. Therefore, we can now use Theorem 15.17 to make quick work of this
problem. Indeed, Theorem 15.17 tells us that


$$|\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})| = [\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}] = 4.$$


Since $|\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})| = 4$, it is clear that all four of $g_1, g_2, g_3, g_4$ are elements of
$\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$.


Note that having expressed the four elements of $\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$ in terms of their action
on $\sqrt{3}$ and $i$, it is easy to check that $\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$ is abelian and has the property that
the square of each of its elements is the identity map.
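For comparison, the direct check that Theorem 15.17 lets us avoid can be mechanized. The following is a rough sketch, not the method of the text. It assumes that $\{1, \sqrt{3}, i, i\sqrt{3}\}$ is a basis for $\mathbb{Q}(\sqrt{3}, i)$ over $\mathbb{Q}$ (which follows from the degree being 4), that each candidate $g_k$ is extended $\mathbb{Q}$-linearly by a sign on each basis element, and that checking products of basis elements is enough by linearity. The names `TABLE`, `SIGNS`, and `preserves_multiplication` are illustrative.

```python
from itertools import product

# Basis e0 = 1, e1 = sqrt(3), e2 = i, e3 = i*sqrt(3).
# TABLE[(r, s)] = (c, t) records that e_r * e_s = c * e_t for r <= s.
TABLE = {
    (0, 0): (1, 0), (0, 1): (1, 1), (0, 2): (1, 2), (0, 3): (1, 3),
    (1, 1): (3, 0), (1, 2): (1, 3), (1, 3): (3, 2),
    (2, 2): (-1, 0), (2, 3): (-1, 1),
    (3, 3): (-3, 0),
}

def basis_product(r, s):
    """Return (c, t) with e_r * e_s = c * e_t (multiplication is commutative)."""
    return TABLE[(min(r, s), max(r, s))]

# SIGNS[name][k] is the sign by which the candidate map scales e_k.
SIGNS = {
    "g1": (1, 1, 1, 1),
    "g2": (1, -1, 1, -1),    # sqrt(3) -> -sqrt(3), i -> i
    "g3": (1, 1, -1, -1),    # sqrt(3) -> sqrt(3),  i -> -i
    "g4": (1, -1, -1, 1),    # sqrt(3) -> -sqrt(3), i -> -i
}

def preserves_multiplication(signs):
    """Check g(e_r * e_s) = g(e_r) * g(e_s) for every pair of basis elements."""
    for r, s in product(range(4), repeat=2):
        _, t = basis_product(r, s)
        if signs[t] != signs[r] * signs[s]:
            return False
    return True

print({name: preserves_multiplication(s) for name, s in SIGNS.items()})
```

Each of the four sign patterns passes the check, which agrees with the conclusion drawn from Theorem 15.17. A sign pattern such as `(1, -1, 1, 1)` fails it, so not every linear map that flips signs on basis elements is an automorphism.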




For the remainder of this section, we will look at additional applications of Theorem 15.17. 
But first we need the following application of Eisenstein’s Criterion. 


Lemma 15.18. If $p$ is prime, then the polynomial $x^{p-1} + \cdots + x + 1$ is irreducible in $\mathbb{Q}[x]$.


Proof. Suppose

$$x^{p-1} + \cdots + x + 1 = (\alpha_s x^s + \cdots + \alpha_1 x + \alpha_0)(\beta_t x^t + \cdots + \beta_1 x + \beta_0)$$

in $\mathbb{Q}[x]$. We need to show that either $s = 0$ or $t = 0$. First, multiply both sides of the previous
equation by $x - 1$ to obtain


$$x^p - 1 = (\alpha_s x^s + \cdots + \alpha_1 x + \alpha_0)(\beta_t x^t + \cdots + \beta_1 x + \beta_0)(x - 1).$$
Next, replace $x$ by $y + 1$ to give us


$$(y+1)^p - 1 = (\alpha_s (y+1)^s + \cdots + \alpha_1 (y+1) + \alpha_0)(\beta_t (y+1)^t + \cdots + \beta_1 (y+1) + \beta_0)(y).$$


After expanding out the $(y+1)^p$ term on the left-hand side and then dividing both sides by $y$,
we have

$$y^{p-1} + p y^{p-2} + \frac{p(p-1)}{2} y^{p-3} + \cdots + \frac{p(p-1)}{2} y + p = (\alpha_s (y+1)^s + \cdots + \alpha_1 (y+1) + \alpha_0)(\beta_t (y+1)^t + \cdots + \beta_1 (y+1) + \beta_0).$$


If we expand all the terms on the right-hand side of the equation of the form $(y+1)^i$, for some
$i$, then the right-hand side is seen to be a product of polynomials in $\mathbb{Q}[y]$ of degrees $s$ and $t$.
However, every coefficient on the left-hand side, with the exception of the leading coefficient,
is a multiple of the prime $p$. Furthermore, the constant term on the left-hand side is not a
multiple of $p^2$. Therefore, we can apply Eisenstein's Criterion to assert that the polynomial on
the left-hand side is irreducible in $\mathbb{Q}[y]$. Looking at the product on the right-hand side, it must
now be the case that either $s = 0$ or $t = 0$, as desired.
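The divisibility conditions used in this proof are easy to test for small primes. The sketch below is an illustration only: it uses the fact that the coefficient of $y^{k-1}$ in $((y+1)^p - 1)/y$ is the binomial coefficient $\binom{p}{k}$, and the function names are hypothetical.

```python
from math import comb

def shifted_coefficients(p):
    """Coefficients of ((y + 1)^p - 1) / y, listed from the constant term upward."""
    return [comb(p, k) for k in range(1, p + 1)]

def eisenstein_applies(coeffs, q):
    """Check the hypotheses of Eisenstein's Criterion for the prime q."""
    *lower, lead = coeffs
    return (lead % q != 0                          # q does not divide the leading coefficient
            and all(c % q == 0 for c in lower)     # q divides every other coefficient
            and lower[0] % (q * q) != 0)           # q^2 does not divide the constant term

print(shifted_coefficients(5))                     # [5, 10, 10, 5, 1]
print(all(eisenstein_applies(shifted_coefficients(p), p) for p in (2, 3, 5, 7, 11, 23)))
```

For the non-prime exponent 4 the same test with the prime 2 fails, because the constant term 4 is divisible by $2^2$; this is consistent with $x^3 + x^2 + x + 1 = (x + 1)(x^2 + 1)$ being reducible.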


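We can now examine the splitting field of $x^p - 1$ over $\mathbb{Q}$.


Corollary 15.19. Let $L$ be the splitting field over $\mathbb{Q}$ of $x^p - 1$, where $p$ is prime. Then

1. $[L : \mathbb{Q}] = |\mathrm{Gal}(L/\mathbb{Q})| = p - 1$.


2. If $\omega = \operatorname{cis}(\frac{2\pi}{p})$, then $\mathrm{Gal}(L/\mathbb{Q}) = \{g_1, \ldots, g_{p-1}\}$, where $L = \mathbb{Q}(\omega)$ and $g_i(\omega) = \omega^i$, for
$1 \le i \le p - 1$.


3. $\mathrm{Gal}(L/\mathbb{Q})$ is abelian.


Proof. By DeMoivre's Theorem, if $\omega = \operatorname{cis}(\frac{2\pi}{p})$, then the $p$ roots of $x^p - 1$ are of the form $\omega^i$,
for $0 \le i \le p - 1$. Since all powers of $\omega$ belong to $\mathbb{Q}(\omega)$, it is clear that $L = \mathbb{Q}(\omega)$.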




Since $x^p - 1 = (x - 1)(x^{p-1} + \cdots + x + 1)$, it follows that all the roots of $x^{p-1} + \cdots + x + 1$ are
of the form $\omega^i$, where $1 \le i \le p - 1$. Lemma 15.18 tells us that $x^{p-1} + \cdots + x + 1$ is irreducible
in $\mathbb{Q}[x]$. Therefore, Theorems 15.3 and 15.17 combine to tell us that


$$[L : \mathbb{Q}] = p - 1 = |\mathrm{Gal}(L/\mathbb{Q})|.$$


Since $L = \mathbb{Q}(\omega)$, every element of $\mathrm{Gal}(L/\mathbb{Q})$ is determined by its action on $\omega$ and
Corollary 5.13 tells us that $\omega$ must get sent to another root of $x^{p-1} + \cdots + x + 1$. Therefore, the
only possible candidates to be elements of $\mathrm{Gal}(L/\mathbb{Q})$ are the $p - 1$ functions $g_i$ which act on
$\omega$ as follows:

$$g_i(\omega) = \omega^i,$$
for $1 \le i \le p - 1$. However, since there are only $p - 1$ candidates and $\mathrm{Gal}(L/\mathbb{Q})$ has exactly
$p - 1$ elements, it follows that $\mathrm{Gal}(L/\mathbb{Q}) = \{g_1, \ldots, g_{p-1}\}$.

Finally, to see that $\mathrm{Gal}(L/\mathbb{Q})$ is abelian, let $g_i, g_j \in \mathrm{Gal}(L/\mathbb{Q})$. We now have


$$g_i g_j(\omega) = g_i(\omega^j) = g_i(\omega)^j = (\omega^i)^j = \omega^{ij} = (\omega^j)^i = g_j(\omega)^i = g_j(\omega^i) = g_j g_i(\omega).$$


Therefore $g_i g_j$ and $g_j g_i$ agree on $\omega$ and so $g_i g_j = g_j g_i$. Thus, $\mathrm{Gal}(L/\mathbb{Q})$ is abelian.


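Examples


1. If $\omega_1 = \operatorname{cis}(\frac{2\pi}{5})$, then $\mathbb{Q}(\omega_1)$ is the splitting field of $x^4 + x^3 + x^2 + x + 1$ over $\mathbb{Q}$ and
$\mathrm{Gal}(\mathbb{Q}(\omega_1)/\mathbb{Q})$ is an abelian group with 4 elements.


2. If $\omega_2 = \operatorname{cis}(\frac{2\pi}{23})$, then $\mathbb{Q}(\omega_2)$ is the splitting field of $x^{22} + x^{21} + \cdots + x + 1$ over $\mathbb{Q}$ and
$\mathrm{Gal}(\mathbb{Q}(\omega_2)/\mathbb{Q})$ is an abelian group with 22 elements.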


We now prove that another collection of Galois groups is abelian. 


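Corollary 15.20. Suppose $n \in \mathbb{N}$, $a \in \mathbb{Q}$, and let $\omega = \operatorname{cis}(\frac{2\pi}{n})$. If $L$ is the splitting field of
$x^n - a$ over $\mathbb{Q}(\omega)$, then $\mathrm{Gal}(L/\mathbb{Q}(\omega))$ is abelian.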


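Proof. Let $\gamma$ be a root of $x^n - a$. Then $\mathbb{Q}(\omega)(\gamma)$ must be contained in $L$. However, the $n$ roots of
$x^n - a$ are $\gamma, \gamma\omega, \ldots, \gamma\omega^{n-1}$, and they all belong to $\mathbb{Q}(\omega)(\gamma)$. Hence, $L = \mathbb{Q}(\omega)(\gamma)$. As a result,
the elements of $\mathrm{Gal}(L/\mathbb{Q}(\omega))$ are completely determined by their action on $\gamma$. However,
Corollary 5.13 tells us that every element of $\mathrm{Gal}(L/\mathbb{Q}(\omega))$ must send $\gamma$ to another root of $x^n - a$.
As a result, if $g, h \in \mathrm{Gal}(L/\mathbb{Q}(\omega))$, then there exist $i, j$ such that

$$g(\gamma) = \gamma\omega^i \quad\text{and}\quad h(\gamma) = \gamma\omega^j.$$
Remember that $g, h$ are the identity on $\mathbb{Q}(\omega)$, so we now have


$$gh(\gamma) = g(\gamma\omega^j) = g(\gamma)g(\omega^j) = (\gamma\omega^i)(\omega^j) = \gamma\omega^{i+j} = (\gamma\omega^j)(\omega^i) = h(\gamma)h(\omega^i) = h(\gamma\omega^i) = hg(\gamma).$$


Since $gh$ and $hg$ agree on $\gamma$, we see that $gh = hg$, hence $\mathrm{Gal}(L/\mathbb{Q}(\omega))$ is abelian.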


Examples


1. Let $L$ be the splitting field of $x^4 - 5$ over $\mathbb{Q}(i)$. Then $\mathrm{Gal}(L/\mathbb{Q}(i))$ is abelian using
Corollary 15.20 with $n = 4$.


2. Let $\omega = \operatorname{cis}(\frac{2\pi}{7})$ and let $L$ be the splitting field of $x^7 - 12$ over $\mathbb{Q}(\omega)$. Then
$\mathrm{Gal}(L/\mathbb{Q}(\omega))$ is abelian using Corollary 15.20 with $n = 7$.


In some earlier examples, we knew the size of Gal(L/K) but did not know if it was abelian. In 
the two preceding examples, we were not concerned with the size of the Galois group, but we 
do know they are abelian. It certainly is not the case that Galois groups of splitting fields are 
always abelian, as we see in 


Corollary 15.21. Let $p \ge 3$ be a prime and let $a \in \mathbb{Q}$ such that $x^p - a$ is irreducible in $\mathbb{Q}[x]$.
If $L$ is the splitting field of $x^p - a$ over $\mathbb{Q}$, then $\mathrm{Gal}(L/\mathbb{Q})$ is a nonabelian group with $p(p - 1)$
elements.


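Proof. Let $\gamma$ be a root of $x^p - a$ and let $\omega = \operatorname{cis}(\frac{2\pi}{p})$; the $p$ roots of $x^p - a$ are $\gamma, \gamma\omega, \ldots,
\gamma\omega^{p-1}$. Observe that all $p$ of these roots belong to the field $\mathbb{Q}(\gamma, \omega)$. On the other hand, since
$\omega = \frac{\gamma\omega}{\gamma}$, it follows that both $\gamma$ and $\omega$ belong to $L$. Thus, $L = \mathbb{Q}(\gamma, \omega)$.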


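Since $\gamma$ is a root of $x^p - a$, the minimum polynomial for $\gamma$ in $\mathbb{Q}(\omega)[x]$ has degree at most $p$.
Thus, $[\mathbb{Q}(\gamma, \omega) : \mathbb{Q}(\omega)] \le p$. In addition, by Lemma 15.18, $[\mathbb{Q}(\omega) : \mathbb{Q}] = p - 1$. Therefore,


$$[L : \mathbb{Q}] = [\mathbb{Q}(\gamma, \omega) : \mathbb{Q}] = [\mathbb{Q}(\gamma, \omega) : \mathbb{Q}(\omega)] \cdot [\mathbb{Q}(\omega) : \mathbb{Q}] = [\mathbb{Q}(\gamma, \omega) : \mathbb{Q}(\omega)] \cdot (p - 1) \le p(p - 1).$$


On the other hand, since $x^p - a$ is irreducible in $\mathbb{Q}[x]$,


$$[L : \mathbb{Q}] = [\mathbb{Q}(\gamma, \omega) : \mathbb{Q}] = [\mathbb{Q}(\gamma, \omega) : \mathbb{Q}(\gamma)] \cdot [\mathbb{Q}(\gamma) : \mathbb{Q}] = [\mathbb{Q}(\gamma, \omega) : \mathbb{Q}(\gamma)] \cdot p.$$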




As a result, $[L : \mathbb{Q}]$ is divisible by both $p$ and $p - 1$. However, $p$ and $p - 1$ are relatively
prime, so $[L : \mathbb{Q}]$ is divisible by $p(p - 1)$. Combining this with the fact that
$[L : \mathbb{Q}] \le p(p - 1)$, we now know that $[L : \mathbb{Q}] = p(p - 1)$. In light of Theorem 15.17,
we have $|\mathrm{Gal}(L/\mathbb{Q})| = p(p - 1)$.


If $g \in \mathrm{Gal}(L/\mathbb{Q})$, then $g$ is determined by its action on $\gamma$ and $\omega$. Corollary 5.13 tells us that $g$
must send $\gamma$ to one of the $p$ roots of $x^p - a$ and also must send $\omega$ to one of the $p - 1$ roots of
$x^{p-1} + \cdots + x + 1$. Therefore, the only possible candidates to be elements of $\mathrm{Gal}(L/\mathbb{Q})$ are the
functions $g_{i,j}$, where


$$g_{i,j}(\gamma) = \gamma\omega^i \quad\text{and}\quad g_{i,j}(\omega) = \omega^j,$$


for $0 \le i \le p - 1$, $1 \le j \le p - 1$. However, since $|\mathrm{Gal}(L/\mathbb{Q})| = p(p - 1)$, all $p(p - 1)$ of our
candidates are indeed elements of $\mathrm{Gal}(L/\mathbb{Q})$.


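To show that $\mathrm{Gal}(L/\mathbb{Q})$ is not abelian, it suffices to find two elements that do not commute. To
this end, consider $g_{1,1}$ and $g_{1,2}$; then we have


$$g_{1,1}g_{1,2}(\gamma) = g_{1,1}(\gamma\omega) = g_{1,1}(\gamma)g_{1,1}(\omega) = (\gamma\omega)\omega = \gamma\omega^2,$$
whereas
$$g_{1,2}g_{1,1}(\gamma) = g_{1,2}(\gamma\omega) = g_{1,2}(\gamma)g_{1,2}(\omega) = (\gamma\omega)\omega^2 = \gamma\omega^3.$$


Since $\omega \ne 1$, we have $\gamma\omega^2 \ne \gamma\omega^3$, so the products $g_{1,1}g_{1,2}$ and $g_{1,2}g_{1,1}$ behave differently on $\gamma$, and we know that
$g_{1,1}g_{1,2} \ne g_{1,2}g_{1,1}$. Hence, $\mathrm{Gal}(L/\mathbb{Q})$ is not abelian.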
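The bookkeeping in this proof can be modeled with pairs of exponents. The sketch below is an illustration under an assumption that is not spelled out in the text: the automorphism $g_{i,j}$ is stored as the pair `(i, j)`, and the composition rule $g_{i,j}\,g_{k,l} = g_{i+jk,\ jl}$ (indices mod $p$) is derived from $g_{i,j}(\gamma\omega^k) = \gamma\omega^{i}\omega^{jk}$ and $g_{i,j}(\omega^l) = \omega^{jl}$. The function names are hypothetical.

```python
def compose(g, h, p):
    """Pair for the composition g∘h (apply h first), with g = g_{i,j} and h = g_{k,l}."""
    (i, j), (k, l) = g, h
    return ((i + j * k) % p, (j * l) % p)

def galois_candidates(p):
    """All pairs (i, j) with 0 <= i <= p-1 and 1 <= j <= p-1."""
    return [(i, j) for i in range(p) for j in range(1, p)]

def is_abelian(p):
    """True if every two candidates commute under compose."""
    els = galois_candidates(p)
    return all(compose(g, h, p) == compose(h, g, p) for g in els for h in els)

p = 5
print(len(galois_candidates(p)))            # p * (p - 1) = 20
print(compose((1, 1), (1, 2), p))           # (2, 2): gamma goes to gamma * omega^2
print(compose((1, 2), (1, 1), p))           # (3, 2): gamma goes to gamma * omega^3
print(is_abelian(p))                        # False
```

The two compositions printed for $g_{1,1}$ and $g_{1,2}$ match the computation in the proof, where the images of $\gamma$ were $\gamma\omega^2$ and $\gamma\omega^3$.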


Examples


1. Let $L$ be the splitting field of $x^3 - 20$ over $\mathbb{Q}$. This polynomial is irreducible in $\mathbb{Q}[x]$
using Eisenstein's Criterion with the prime 5. Corollary 15.21 now applies with the
prime $p = 3$. Therefore, $\mathrm{Gal}(L/\mathbb{Q})$ is a group that is nonabelian and has six elements.


2. Now let $L$ be the splitting field of $x^{53} - 22$ over $\mathbb{Q}$. Using Eisenstein's Criterion
with the prime 2 or 11, we can see that the polynomial is irreducible in $\mathbb{Q}[x]$.
Corollary 15.21 now asserts, with the prime $p = 53$, that $\mathrm{Gal}(L/\mathbb{Q})$ is a group with
$2756 = 53 \cdot 52$ elements that is not abelian.


Although the Galois groups in the previous examples are not abelian, in some sense, they 
are not far from being abelian. In Section 8.3, we formalized this notion when we introduced 
the concept of a group being solvable. In fact, we will soon show that the Galois groups in 
Corollary 15.21 are solvable. However, we need 




Definition 15.22. If $G$ is a group and $g, h \in G$, then let $(g, h) = ghg^{-1}h^{-1}$ and we call this
element of $G$ the commutator of $g$ and $h$.


If we let $e$ denote the identity of $G$, then you should check that $g$ and $h$ commute if and only
if $(g, h) = e$. More generally, observe that a group $G$ is abelian if and only if all commutators
of elements of $G$ belong to the set $\{e\}$. Now suppose $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ are fields. If
$g \in \mathrm{Gal}(L/K)$, then $g$ is an automorphism of $L$, which is the identity map on $K$. Since $K$
contains $\mathbb{Q}$, $g$ is also the identity map on $\mathbb{Q}$. Thus, $g \in \mathrm{Gal}(L/\mathbb{Q})$ and it follows that


$$\mathrm{Gal}(L/\mathbb{Q}) \supseteq \mathrm{Gal}(L/K).$$


We now look at a situation somewhat more general than the one in Corollary 15.20. 


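Corollary 15.23. Let $n \in \mathbb{N}$, $a \in \mathbb{Q}$, and let $L$ and $K$ be the splitting fields over $\mathbb{Q}$ of $x^n - a$
and $x^n - 1$, respectively. Then


1. $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ and $\mathrm{Gal}(L/\mathbb{Q}) \supseteq \mathrm{Gal}(L/K)$.
2. If $g, h \in \mathrm{Gal}(L/\mathbb{Q})$, then $(g, h) \in \mathrm{Gal}(L/K)$.
3. $\mathrm{Gal}(L/K)$ is abelian.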


Before proving Corollary 15.23, let us think about what it actually means. We have the chain
of groups


$$\mathrm{Gal}(L/\mathbb{Q}) \supseteq \mathrm{Gal}(L/K) \supseteq \{e\},$$


with the properties that
(i) $(g, h) \in \mathrm{Gal}(L/K)$, for all $g, h \in \mathrm{Gal}(L/\mathbb{Q})$, and
(ii) $(g, h) \in \{e\}$, for all $g, h \in \mathrm{Gal}(L/K)$.


By Proposition 8.33, we can see that although Gal(L/Q) might not be abelian, it must be 
solvable. 


We would certainly consider $x^n - a$ to be an example of a polynomial in $\mathbb{Q}[x]$ whose roots can
be found using an algebraic algorithm like the quadratic formula. The fact that the Galois 
group of its splitting field over Q is solvable is not a coincidence. In fact, it is at the heart of 
what goes on when we prove the insolvability of the quintic in Chapter 17. We will show that 
if a polynomial in Q[x] is solvable by radicals, then the Galois group over Q of its splitting 
field is a solvable group. Then, we will produce polynomials in Q[x] of degree 5 whose 
splitting fields over Q have Galois groups that are not solvable. Thus, Corollary 15.23 
foreshadows the path we will take in proving the insolvability of the quintic. 




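Proof of Corollary 15.23. For (1), as in the proof of Corollary 15.20, let $\gamma$ be a root of
$x^n - a$ and let $\omega = \operatorname{cis}(\frac{2\pi}{n})$. Then $K = \mathbb{Q}(\omega)$ and $L = \mathbb{Q}(\omega, \gamma)$, thus $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ and
$\mathrm{Gal}(L/\mathbb{Q}) \supseteq \mathrm{Gal}(L/K)$.


For (2), if $g, h \in \mathrm{Gal}(L/\mathbb{Q})$, then $g^{-1}, h^{-1} \in \mathrm{Gal}(L/\mathbb{Q})$ and Corollary 5.13 asserts that $g^{-1}(\omega)$
and $h^{-1}(\omega)$ must also be roots of $\frac{x^n - 1}{x - 1} = x^{n-1} + \cdots + x + 1$. Therefore, $g^{-1}(\omega) = \omega^i$ and
$h^{-1}(\omega) = \omega^j$, where $1 \le i, j \le n - 1$. Observe that $g(\omega^i) = \omega$ and $h(\omega^j) = \omega$, and we now have


$$(g, h)(\omega) = ghg^{-1}h^{-1}(\omega) = ghg^{-1}(\omega^j) = gh(g^{-1}(\omega)^j) = gh((\omega^i)^j) = gh((\omega^j)^i) = g((h(\omega^j))^i) = g(\omega^i) = \omega.$$


Therefore, $(g, h)$ is an element of $\mathrm{Gal}(L/\mathbb{Q})$ that is also the identity map on $\omega$. Hence, $(g, h)$ is
the identity on all of $K = \mathbb{Q}(\omega)$, so $(g, h) \in \mathrm{Gal}(L/K)$.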


Since (3) is actually Corollary 15.20, the proof is complete. 


Example


As in the example preceding Definition 15.22, let $L$ be the splitting field of $x^{53} - 22$ over
$\mathbb{Q}$. Then $\mathrm{Gal}(L/\mathbb{Q})$ is a group with $2756 = 53 \cdot 52$ elements that is not abelian. However,
if we let $\omega = \operatorname{cis}(\frac{2\pi}{53})$, then we have the chain of groups


$$\mathrm{Gal}(L/\mathbb{Q}) \supseteq \mathrm{Gal}(L/\mathbb{Q}(\omega)) \supseteq \{e\}.$$


Since the commutator of any two elements of $\mathrm{Gal}(L/\mathbb{Q})$ is an element of $\mathrm{Gal}(L/\mathbb{Q}(\omega))$,
and the commutator of any two elements of $\mathrm{Gal}(L/\mathbb{Q}(\omega))$ belongs to $\{e\}$,
Proposition 8.33 tells us that $\mathrm{Gal}(L/\mathbb{Q})$ is solvable.
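The two commutator properties behind this chain can be checked by brute force on a smaller prime. The sketch below is an illustration under labeled assumptions: each automorphism $g_{i,j}$ from Corollary 15.21 is stored as the pair `(i, j)`, meaning $\gamma \mapsto \gamma\omega^i$ and $\omega \mapsto \omega^j$; the composition rule is derived from that description; and the prime 7 stands in for 53 only to keep the loop short. The helper names are hypothetical, and `pow(j, -1, p)` needs Python 3.8 or later.

```python
def compose(g, h, p):
    """Pair for g∘h (apply h first), where g = g_{i,j} and h = g_{k,l}."""
    (i, j), (k, l) = g, h
    return ((i + j * k) % p, (j * l) % p)

def inverse(g, p):
    """Pair for the inverse of g_{i,j}; uses the modular inverse of j."""
    i, j = g
    j_inv = pow(j, -1, p)
    return ((-i * j_inv) % p, j_inv)

def commutator(g, h, p):
    """Pair for (g, h) = g h g^{-1} h^{-1}."""
    return compose(compose(g, h, p), compose(inverse(g, p), inverse(h, p), p), p)

p = 7
group = [(i, j) for i in range(p) for j in range(1, p)]
fixing_omega = [g for g in group if g[1] == 1]     # models Gal(L/Q(omega))

# (i) every commutator of elements of the big group fixes omega
print(all(commutator(g, h, p)[1] == 1 for g in group for h in group))
# (ii) every commutator of elements fixing omega is the identity (0, 1)
print(all(commutator(g, h, p) == (0, 1) for g in fixing_omega for h in fixing_omega))
```

Both checks print `True`, matching properties (i) and (ii) that Proposition 8.33 requires.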


Given fields $F \subseteq K \subseteq L$, Theorem 15.6 revealed the relationship between $[L : K]$, $[K : F]$,
and $[L : F]$. We conclude this chapter by revealing a relationship between the Galois groups
$\mathrm{Gal}(L/K)$, $\mathrm{Gal}(K/F)$, and $\mathrm{Gal}(L/F)$. This relationship will be essential in proving the
insolvability of the quintic.


Corollary 15.24. Let $\mathbb{Q} \subseteq F \subseteq K \subseteq L \subseteq \mathbb{C}$ be fields such that $K$ is the splitting field over $F$
of some $f(x) \in F[x]$. For every $g \in \mathrm{Gal}(L/F)$, let $g_K$ denote the restriction of $g$ to $K$. Then
the function


$$\psi : \mathrm{Gal}(L/F) \to \mathrm{Gal}(K/F)$$


defined as $\psi(g) = g_K$, for all $g \in \mathrm{Gal}(L/F)$, is a homomorphism of groups. Furthermore, the
kernel of $\psi$ is the group $\mathrm{Gal}(L/K)$.




Proof. The first thing we need to check is that the values of $\psi$ do indeed belong to $\mathrm{Gal}(K/F)$.
To this end, observe that if $g \in \mathrm{Gal}(L/F)$, then $g_K$ is an injective homomorphism from $K$ to
$\mathbb{C}$, which is the identity map on $F$. Since $K$ is a splitting field over $F$ of some polynomial in
$F[x]$, Lemma 15.15 tells us that $g_K(K) = K$. Therefore, $g_K$ is an automorphism of $K$, so
$g_K \in \mathrm{Gal}(K/F)$. Thus, $\psi$ does send elements of $\mathrm{Gal}(L/F)$ to $\mathrm{Gal}(K/F)$.


Next, we need to show that $\psi$ is a homomorphism of groups. If $g, h \in \mathrm{Gal}(L/F)$, then $\psi(gh)$
and $\psi(g)\psi(h)$ are both elements of $\mathrm{Gal}(K/F)$. In fact, if $\alpha \in K$, then the way we compute the
values of both $\psi(gh)$ and $\psi(g)\psi(h)$ at $\alpha$ is to plug $\alpha$ into the composition $gh$. Since $\psi(gh)$ and
$\psi(g)\psi(h)$ agree on all elements of $K$, $\psi(gh) = \psi(g)\psi(h)$. Thus, $\psi$ is a homomorphism.


Finally, if $g \in \mathrm{Gal}(L/F)$, then $g$ belongs to the kernel of $\psi$ precisely if $g_K$ is the identity
map on $K$. But the automorphisms of $L$ that are the identity map on $K$ are, by the definition of
the Galois group, the elements of the group $\mathrm{Gal}(L/K)$. Thus, $\mathrm{Gal}(L/K)$ is indeed the kernel
of $\psi$.


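Example


Once again, we consider the case where $L$ is the splitting field of $x^{53} - 22$ over $\mathbb{Q}$. If we let
$\omega = \operatorname{cis}(\frac{2\pi}{53})$, then $\mathbb{Q}(\omega)$ is the splitting field of $x^{53} - 1$ over $\mathbb{Q}$ and we can examine the
chain of fields $\mathbb{Q} \subseteq \mathbb{Q}(\omega) \subseteq L$. By Lemma 15.15, the restriction of every $g \in \mathrm{Gal}(L/\mathbb{Q})$ to
$\mathbb{Q}(\omega)$ yields an element of $\mathrm{Gal}(\mathbb{Q}(\omega)/\mathbb{Q})$. The elements of $\mathrm{Gal}(L/\mathbb{Q})$ that act like the
identity on $\mathbb{Q}(\omega)$ are precisely the elements of $\mathrm{Gal}(L/\mathbb{Q}(\omega))$.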


Back in Definition 5.12, given a commutative ring $R$ with automorphism $\sigma$, we introduced the
set $R^\sigma = \{r \in R \mid \sigma(r) = r\}$. Having discussed Galois groups and extension fields throughout
this chapter, there are some natural ways to extend this concept. Given fields $K \subseteq L$, if $H$ is a
subgroup of $\mathrm{Gal}(L/K)$, we can look at the elements of $L$ that are fixed by all elements of $H$.
In the opposite direction, given a field $M$ lying between $K$ and $L$, we can look at the elements
of $\mathrm{Gal}(L/K)$ that act as the identity on $M$. To be more precise, we have


Definition 15.25. Let $K \subseteq L$ be fields and let $G = \mathrm{Gal}(L/K)$. If $H$ is a subgroup of $G$, we let
$L^H = \{l \in L \mid h(l) = l, \text{ for every } h \in H\}$ and call this set the fixed subfield of $L$ under $H$. In the
opposite direction, if $M$ is a field such that $K \subseteq M \subseteq L$, we let
$G_M = \{g \in G \mid g(m) = m, \text{ for every } m \in M\}$.


Although we refer to $L^H$ as a fixed subfield, we have not yet shown that it is actually a field. In
the other direction, although we have not shown that $G_M$ is a subgroup of $G$, you might guess
that it is. This leads us to




Proposition 15.26. Let $K \subseteq L$ be fields and let $G = \mathrm{Gal}(L/K)$. If $H$ is a subgroup of $G$, then
$L^H$ is a field such that $K \subseteq L^H \subseteq L$. Furthermore, if $M$ is a field such that $K \subseteq M \subseteq L$, then
$G_M$ is a subgroup of $G$.


Proof. In order to show that $L^H$ is a field, we first need to use Proposition 5.15 to show that
$L^H$ is a commutative ring. Every element of $H$ certainly acts as the identity map on $K$, hence
$K \subseteq L^H$. Since $0, 1, -1 \in K$, it follows that $0, 1, -1 \in L^H$ and part (b) of Proposition 5.15
is satisfied. Therefore, to satisfy part (a) of Proposition 5.15 and to complete the proof
that $L^H$ is a commutative ring, it suffices to show that $L^H$ is closed under addition and
multiplication.


Suppose $x, y \in L^H$; we must show that $x + y, x \cdot y \in L^H$. If $h \in H$, we know that $h(x) = x$ and
$h(y) = y$. Since $h$ is an automorphism, it follows that


$$h(x + y) = h(x) + h(y) = x + y$$
and
$$h(x \cdot y) = h(x) \cdot h(y) = x \cdot y.$$


Since the previous equations hold for all $h \in H$, we have $x + y, x \cdot y \in L^H$.


To finish showing that $L^H$ is a field, we must show that if $x$ is a nonzero element of $L^H$, then
$x^{-1} \in L^H$. Since $x \in L$, we know that $x^{-1} \in L$. However, we still need to show that if $h \in H$,
then $h(x^{-1}) = x^{-1}$. Since $h(x) = x$ and $h$ is an automorphism, we have


$$1 = h(1) = h(x \cdot x^{-1}) = h(x) \cdot h(x^{-1}) = x \cdot h(x^{-1}).$$
Multiplying the far ends of the previous equation by $x^{-1}$ results in $x^{-1} = h(x^{-1})$. Thus,
$x^{-1} \in L^H$ and $L^H$ is indeed a field.


In the other direction, in order to prove that $G_M$ is a subgroup of $G$, Proposition 8.6 asserts
that it suffices to show that if $g, h \in G_M$, then we also have $gh^{-1} \in G_M$. To show that
$gh^{-1} \in G_M$, we need to verify that $gh^{-1}(m) = m$, for all $m \in M$. If $m \in M$, we have $g(m) = m$
and $h(m) = m$. Observe that the last equation also tells us that $h^{-1}(m) = m$. We now have


$$gh^{-1}(m) = g(h^{-1}(m)) = g(m) = m,$$


thereby concluding the proof.




We examine the correspondence from Definition 15.25 in the following example. 


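Example


Let $L$ be the splitting field of $(x^2 - 30)(x^2 + 1)$ over $\mathbb{Q}$. It is easy to see that
$L = \mathbb{Q}(\sqrt{30}, i)$. Observe that


$$[L : \mathbb{Q}] = [\mathbb{Q}(\sqrt{30}, i) : \mathbb{Q}] = [\mathbb{Q}(\sqrt{30}, i) : \mathbb{Q}(\sqrt{30})] \cdot [\mathbb{Q}(\sqrt{30}) : \mathbb{Q}].$$


Since $\mathbb{Q}(\sqrt{30}) \subseteq \mathbb{R}$ and $i \notin \mathbb{R}$, it follows that $i \notin \mathbb{Q}(\sqrt{30})$ and $[\mathbb{Q}(\sqrt{30}, i) : \mathbb{Q}(\sqrt{30})] = 2$.
Furthermore, we know that $[\mathbb{Q}(\sqrt{30}) : \mathbb{Q}] = 2$, and our observations now combine to tell
us that $[L : \mathbb{Q}] = 4$. In light of Theorem 15.17, we also know that $|\mathrm{Gal}(L/\mathbb{Q})| = 4$.


The behavior of each $g \in \mathrm{Gal}(L/\mathbb{Q})$ is completely determined by its action on $\sqrt{30}$ and $i$.
Corollary 5.13 now asserts that if $g \in \mathrm{Gal}(L/\mathbb{Q})$ then $g(\sqrt{30})$ must be a root of $x^2 - 30$
and $g(i)$ must be a root of $x^2 + 1$. This leaves us with four candidates to be elements of
$\mathrm{Gal}(L/\mathbb{Q})$ and they are $g_1, g_2, g_3, g_4$, where


$$g_1(\sqrt{30}) = \sqrt{30} \text{ and } g_1(i) = i,$$
$$g_2(\sqrt{30}) = -\sqrt{30} \text{ and } g_2(i) = i,$$
$$g_3(\sqrt{30}) = \sqrt{30} \text{ and } g_3(i) = -i,$$
$$g_4(\sqrt{30}) = -\sqrt{30} \text{ and } g_4(i) = -i.$$


Since $|\mathrm{Gal}(L/\mathbb{Q})| = 4$, all four of the preceding candidates are indeed elements of
$\mathrm{Gal}(L/\mathbb{Q})$, so $\mathrm{Gal}(L/\mathbb{Q}) = \{g_1, g_2, g_3, g_4\}$. By examining how compositions of the
various $g_i$ act on $\sqrt{30}$ and $i$, we obtain the following table for $\mathrm{Gal}(L/\mathbb{Q})$.


$$\begin{array}{c|cccc}
\circ & g_1 & g_2 & g_3 & g_4 \\ \hline
g_1 & g_1 & g_2 & g_3 & g_4 \\
g_2 & g_2 & g_1 & g_4 & g_3 \\
g_3 & g_3 & g_4 & g_1 & g_2 \\
g_4 & g_4 & g_3 & g_2 & g_1
\end{array}$$


If we let $G = \mathrm{Gal}(L/\mathbb{Q})$, then a brief examination of the table for $G$ indicates that $G$ has
exactly five subgroups and they are


$$H_1 = \{g_1\}, \quad H_2 = \{g_1, g_2\}, \quad H_3 = \{g_1, g_3\}, \quad H_4 = \{g_1, g_4\}, \quad H_5 = G.$$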




The set $\{1, \sqrt{30}, i, i\sqrt{30}\}$ is a basis for $L$ over $\mathbb{Q}$, so every element of $L$ can be expressed
uniquely as


$$a + b\sqrt{30} + ci + di\sqrt{30},$$


where $a, b, c, d \in \mathbb{Q}$. By examining the action of the automorphisms in $G$ on elements of
this type, you can see which elements of $L$ are fixed by all the automorphisms in the five
different subgroups of $G$. Doing the required computations, you should find that


$$L^{H_1} = L, \qquad L^{H_2} = \{a + ci \mid a, c \in \mathbb{Q}\} = \mathbb{Q}(i),$$
$$L^{H_3} = \{a + b\sqrt{30} \mid a, b \in \mathbb{Q}\} = \mathbb{Q}(\sqrt{30}),$$
$$L^{H_4} = \{a + di\sqrt{30} \mid a, d \in \mathbb{Q}\} = \mathbb{Q}(i\sqrt{30}),$$
$$L^{H_5} = \mathbb{Q}.$$


It certainly appears that it is not easy to find all the fields that lie between $\mathbb{Q}$ and $L$. This
is an issue we will deal with later, but, for now, let us concern ourselves with the five fields


$$M_1 = L, \quad M_2 = \mathbb{Q}(i), \quad M_3 = \mathbb{Q}(\sqrt{30}), \quad M_4 = \mathbb{Q}(i\sqrt{30}), \quad M_5 = \mathbb{Q}.$$


By checking which automorphisms from $G$ act as the identity map on the various $M_i$, it
follows that


$$G_{M_1} = \{g_1\} = H_1, \quad G_{M_2} = \{g_1, g_2\} = H_2, \quad G_{M_3} = \{g_1, g_3\} = H_3,$$
$$G_{M_4} = \{g_1, g_4\} = H_4, \quad G_{M_5} = \{g_1, g_2, g_3, g_4\} = H_5 = G.$$


In this example, it turns out that the correspondence that goes from $H$ to $L^H$ is actually a
bijection from the set of subgroups of $\mathrm{Gal}(L/\mathbb{Q})$ to the set of fields that lie between $\mathbb{Q}$
and $L$. This is no coincidence, as a fundamental result in Galois theory states that if
$\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ are fields such that $L$ is the splitting field of some $f(x) \in K[x]$, then the
correspondence from $H$ to $L^H$ described in Definition 15.25 is indeed a bijection between
the subgroups of $\mathrm{Gal}(L/K)$ and the fields that lie between $K$ and $L$. Furthermore, the
other correspondence in Definition 15.25 from $M$ to $G_M$ is the inverse of the bijection
from $H$ to $L^H$.
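The computations in this example can be reproduced in a few lines. The sketch below is an illustration under labeled assumptions: each $g_k$ acts on the basis $\{1, \sqrt{30}, i, i\sqrt{30}\}$ by a sign, so an element $a + b\sqrt{30} + ci + di\sqrt{30}$ is fixed by $g_k$ exactly when the coefficients at the negated basis elements vanish. Each fixed field is reported by the basis elements that span it, and the names `SIGNS`, `fixed_basis`, and `stabilizer` are hypothetical.

```python
BASIS = ("1", "sqrt30", "i", "i*sqrt30")

# SIGNS[g][k] is the sign by which g scales the k-th basis element.
SIGNS = {
    "g1": (1, 1, 1, 1),
    "g2": (1, -1, 1, -1),    # sqrt30 -> -sqrt30, i -> i
    "g3": (1, 1, -1, -1),    # sqrt30 -> sqrt30,  i -> -i
    "g4": (1, -1, -1, 1),    # sqrt30 -> -sqrt30, i -> -i
}

SUBGROUPS = {
    "H1": ["g1"],
    "H2": ["g1", "g2"],
    "H3": ["g1", "g3"],
    "H4": ["g1", "g4"],
    "H5": ["g1", "g2", "g3", "g4"],
}

def fixed_basis(subgroup):
    """Basis elements spanning L^H: those fixed by every element of H."""
    return [BASIS[k] for k in range(4) if all(SIGNS[g][k] == 1 for g in subgroup)]

def stabilizer(field_basis):
    """G_M for the field M spanned over Q by the given basis elements."""
    idx = [BASIS.index(b) for b in field_basis]
    return [g for g in SIGNS if all(SIGNS[g][k] == 1 for k in idx)]

for name, H in SUBGROUPS.items():
    M = fixed_basis(H)
    print(name, M, stabilizer(M) == H)
```

Each line of output ends in `True`, which is the statement that passing from $H$ to $L^H$ and then to $G_{L^H}$ returns $H$ for all five subgroups.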


Observe that if $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ are fields, then there is no obvious way to go about finding all
the fields that lie between $K$ and $L$. On the other hand, if $L$ is a finite extension of $K$, then
$\mathrm{Gal}(L/K)$ is a finite group and it might not be too difficult to find all of its subgroups. In the
case that $L$ is the splitting field over $K$ of some $f(x) \in K[x]$, then the beautiful correspondence
in Definition 15.25 from $H$ to $L^H$ indicates that knowing all the subgroups of $\mathrm{Gal}(L/K)$
enables you to find all the fields that lie between $K$ and $L$. Therefore, in our previous example,
the correspondence from $H$ to $L^H$ actually tells us that there are exactly five fields that lie
between $\mathbb{Q}$ and $\mathbb{Q}(\sqrt{30}, i)$.


We will not prove this fundamental result as the proof requires developing more Galois theory
than is needed to prove the insolvability of the quintic. However, in the exercises, you will get
a great deal of practice doing computations and working with this important correspondence
between subgroups of $\mathrm{Gal}(L/K)$ and fields that lie between $K$ and $L$.


Exercises for Sections 15.2 and 15.3 


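In exercises 1-25, we let $L$ be the splitting field over $\mathbb{Q}$ of $x^3 - 10$.

1. Show that $L = \mathbb{Q}(10^{1/3}, \omega)$, where $\omega = \mathrm{cis}(2\pi/3)$.

2. Show that $[L : \mathbb{Q}] = 6$ and determine the size of $\mathrm{Gal}(L/\mathbb{Q})$.

3. If $g \in \mathrm{Gal}(L/\mathbb{Q})$, show that $g(10^{1/3})$ is equal to either $10^{1/3}$, $10^{1/3}\omega$, or $10^{1/3}\omega^2$.

4. If $g \in \mathrm{Gal}(L/\mathbb{Q})$, show that $g(\omega)$ is equal to either $\omega$ or $\omega^2$.

5. Show that the set $\{1, 10^{1/3}, 10^{2/3}, \omega, 10^{1/3}\omega, 10^{2/3}\omega\}$ is a basis for $L$ over $\mathbb{Q}$ and then explain
why the action of every $g \in \mathrm{Gal}(L/\mathbb{Q})$ is determined by its behavior on $10^{1/3}$ and $\omega$.

6. If we define $g_{i,j}$ as $g_{i,j}(10^{1/3}) = 10^{1/3}\omega^i$ and $g_{i,j}(\omega) = \omega^j$, for $0 \le i \le 2$ and $1 \le j \le 2$, show
that the six functions of the form $g_{i,j}$ make up all the elements of $\mathrm{Gal}(L/\mathbb{Q})$.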


Exercises 7-25 will examine the correspondences discussed in Definition 15.25 between the 
subgroups of Gal(L/Q) and the fields that lie between Q and L. When performing 
computations with the various subgroups and fields involved with these correspondences, you 
should feel free to use the following table for Gal(L/Q), as well as the basis for L over Q 
provided in exercise 5. 


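∘          g_{0,1}  g_{0,2}  g_{1,1}  g_{1,2}  g_{2,1}  g_{2,2}
g_{0,1}    g_{0,1}  g_{0,2}  g_{1,1}  g_{1,2}  g_{2,1}  g_{2,2}
g_{0,2}    g_{0,2}  g_{0,1}  g_{2,2}  g_{2,1}  g_{1,2}  g_{1,1}
g_{1,1}    g_{1,1}  g_{1,2}  g_{2,1}  g_{2,2}  g_{0,1}  g_{0,2}
g_{1,2}    g_{1,2}  g_{1,1}  g_{0,2}  g_{0,1}  g_{2,2}  g_{2,1}
g_{2,1}    g_{2,1}  g_{2,2}  g_{0,1}  g_{0,2}  g_{1,1}  g_{1,2}
g_{2,2}    g_{2,2}  g_{2,1}  g_{1,2}  g_{1,1}  g_{0,2}  g_{0,1}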
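
The entries of this table can be checked mechanically. The sketch below is ours, not part of the exercises: it encodes $g_{i,j}$ as the pair `(i, j)` and uses a composition rule that we derive from the definitions in exercise 6. Applying $g_{c,d}$ first and then $g_{a,b}$ sends $10^{1/3}$ to $10^{1/3}\omega^{a+bc}$ and $\omega$ to $\omega^{bd}$, with exponents read modulo 3. The names `ELEMENTS`, `compose`, and `cayley_table` are hypothetical.

```python
# g_{i,j} is encoded as the pair (i, j), listed in the same order as the table.
ELEMENTS = [(i, j) for i in range(3) for j in (1, 2)]


def compose(g, h):
    """Return g∘h (apply h first, then g).

    With g = (a, b) and h = (c, d):
      10^(1/3) -> 10^(1/3) w^c -> 10^(1/3) w^(a + b*c)
      w        -> w^d          -> w^(b*d)
    where w is a primitive cube root of unity, so exponents are mod 3.
    """
    a, b = g
    c, d = h
    return ((a + b * c) % 3, (b * d) % 3)


def cayley_table():
    """Rows of the table: row g lists g∘h for h running over ELEMENTS."""
    return {g: [compose(g, h) for h in ELEMENTS] for g in ELEMENTS}


if __name__ == "__main__":
    for g, row in cayley_table().items():
        print(g, row)
```

Each printed row should agree with the corresponding row of the table above.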


7. If $H_1$ is the subgroup $\{g_{0,1}\}$, find the elements of the field $L^{H_1}$.

8. Compute $[L : L^{H_1}]$ and $[L^{H_1} : \mathbb{Q}]$.

9. If $H_2$ is the subgroup $\{g_{0,1}, g_{0,2}\}$, find the elements of the field $L^{H_2}$.

10. Compute $[L : L^{H_2}]$ and $[L^{H_2} : \mathbb{Q}]$.

11. If $H_3$ is the subgroup $\{g_{0,1}, g_{1,2}\}$, find the elements of the field $L^{H_3}$.

12. Compute $[L : L^{H_3}]$ and $[L^{H_3} : \mathbb{Q}]$.

13. If $H_4$ is the subgroup $\{g_{0,1}, g_{2,2}\}$, find the elements of the field $L^{H_4}$.

14. Compute $[L : L^{H_4}]$ and $[L^{H_4} : \mathbb{Q}]$.

15. If $H_5$ is the subgroup $\{g_{0,1}, g_{1,1}, g_{2,1}\}$, find the elements of the field $L^{H_5}$.

16. Compute $[L : L^{H_5}]$ and $[L^{H_5} : \mathbb{Q}]$.

17. If $H_6$ is all of $\mathrm{Gal}(L/\mathbb{Q})$, find the elements of the field $L^{H_6}$.

18. Compute $[L : L^{H_6}]$ and $[L^{H_6} : \mathbb{Q}]$.

19. If $M_1 = \mathbb{Q}$, find the elements of the subgroup $G_{M_1}$.

20. If $M_2 = \mathbb{Q}(\omega)$, find the elements of the subgroup $G_{M_2}$.

21. If $M_3 = \mathbb{Q}(10^{1/3}\omega)$, find the elements of the subgroup $G_{M_3}$.

22. If $M_4 = \mathbb{Q}(10^{1/3}\omega^2)$, find the elements of the subgroup $G_{M_4}$.

23. If $M_5 = \mathbb{Q}(10^{1/3})$, find the elements of the subgroup $G_{M_5}$.

24. If $M_6 = L$, find the elements of the subgroup $G_{M_6}$.

25. Is $\mathrm{Gal}(L/\mathbb{Q})$ abelian? If not, is it solvable?


In exercises 26-47, we let $L$ be the splitting field over $\mathbb{Q}$ of $x^4 + 1$.

26. Show that $L = \mathbb{Q}(\omega)$, where $\omega = \mathrm{cis}(\pi/4)$.

27. Show that $[L : \mathbb{Q}] = 4$ and determine the size of $\mathrm{Gal}(L/\mathbb{Q})$.

28. If $g \in \mathrm{Gal}(L/\mathbb{Q})$, show that $g(\omega)$ is equal to either $\omega$, $\omega^3$, $-\omega$, or $-\omega^3$.

29. Show that the set $\{1, \omega, \omega^2, \omega^3\}$ is a basis for $L$ over $\mathbb{Q}$ and then explain why the action
of every $g \in \mathrm{Gal}(L/\mathbb{Q})$ is determined by its behavior on $\omega$.

30. If we define $e, g, h, k$ as $e(\omega) = \omega$, $g(\omega) = \omega^3$, $h(\omega) = -\omega$, and $k(\omega) = -\omega^3$, show that
the four functions $e, g, h, k$ make up all the elements of $\mathrm{Gal}(L/\mathbb{Q})$.


Exercises 31-47 will examine the correspondences discussed in Definition 15.25 between the
subgroups of Gal(L/Q) and the fields that lie between Q and L. When performing
computations with the various subgroups and fields involved with these correspondences, you
should feel free to use the table below for Gal(L/Q) as well as the basis for L over Q
provided in exercise 29.


∘    e  g  h  k
e    e  g  h  k
g    g  e  k  h
h    h  k  e  g
k    k  h  g  e

31. If $H_1$ is the subgroup $\{e\}$, find the elements of the field $L^{H_1}$.

32. Compute $[L : L^{H_1}]$ and $[L^{H_1} : \mathbb{Q}]$.

33. If $H_2$ is the subgroup $\{e, g\}$, find the elements of the field $L^{H_2}$.

34. Compute $[L : L^{H_2}]$ and $[L^{H_2} : \mathbb{Q}]$.

35. If $H_3$ is the subgroup $\{e, h\}$, find the elements of the field $L^{H_3}$.

36. Compute $[L : L^{H_3}]$ and $[L^{H_3} : \mathbb{Q}]$.

37. If $H_4$ is the subgroup $\{e, k\}$, find the elements of the field $L^{H_4}$.

38. Compute $[L : L^{H_4}]$ and $[L^{H_4} : \mathbb{Q}]$.

39. If $H_5$ is all of $\mathrm{Gal}(L/\mathbb{Q})$, find the elements of the field $L^{H_5}$.

40. Compute $[L : L^{H_5}]$ and $[L^{H_5} : \mathbb{Q}]$.

41. Show that $L$ contains $i$, $\sqrt{2}$, and $i\sqrt{2}$.

42. Determine the values of the automorphisms $g, h, k$ on $i$, $\sqrt{2}$, and $i\sqrt{2}$.

43. If $M_1 = \mathbb{Q}$, find the elements of the subgroup $G_{M_1}$.

44. If $M_2 = \mathbb{Q}(i\sqrt{2})$, find the elements of the subgroup $G_{M_2}$.

45. If $M_3 = \mathbb{Q}(i)$, find the elements of the subgroup $G_{M_3}$.

46. If $M_4 = \mathbb{Q}(\sqrt{2})$, find the elements of the subgroup $G_{M_4}$.

47. If $M_5 = L$, find the elements of the subgroup $G_{M_5}$.


In exercises 48-102, we let $L$ be the splitting field over $\mathbb{Q}$ of $(x^2 - 5)(x^2 - 7)(x^2 + 1)$.

48. Show that $L = \mathbb{Q}(\sqrt{5}, \sqrt{7}, i)$.

49. Show that $[L : \mathbb{Q}] = 8$ and determine the size of $\mathrm{Gal}(L/\mathbb{Q})$.

50. If $g \in \mathrm{Gal}(L/\mathbb{Q})$, show that $g(\sqrt{5})$ is equal to either $\sqrt{5}$ or $-\sqrt{5}$.

51. If $g \in \mathrm{Gal}(L/\mathbb{Q})$, show that $g(\sqrt{7})$ is equal to either $\sqrt{7}$ or $-\sqrt{7}$.

52. If $g \in \mathrm{Gal}(L/\mathbb{Q})$, show that $g(i)$ is equal to either $i$ or $-i$.

53. Show that the set $\{1, \sqrt{5}, \sqrt{7}, \sqrt{35}, i, i\sqrt{5}, i\sqrt{7}, i\sqrt{35}\}$ is a basis for $L$ over $\mathbb{Q}$, and then
explain why the action of every $g \in \mathrm{Gal}(L/\mathbb{Q})$ is determined by its behavior on $\sqrt{5}$, $\sqrt{7}$,
and $i$.

54. If we define $g_{j,k,l}$ as $g_{j,k,l}(\sqrt{5}) = (-1)^j\sqrt{5}$, $g_{j,k,l}(\sqrt{7}) = (-1)^k\sqrt{7}$, and
$g_{j,k,l}(i) = (-1)^l i$, for $0 \le j, k, l \le 1$, show that the eight functions of the form $g_{j,k,l}$ make
up all the elements of $\mathrm{Gal}(L/\mathbb{Q})$.

Exercises 55-102 will examine the correspondences discussed in Definition 15.25 between the
subgroups of Gal(L/Q) and the fields that lie between Q and L. When performing
computations with the various subgroups and fields involved with these correspondences, you
should feel free to use the table below for Gal(L/Q) as well as the basis for L over Q
provided in exercise 53.


∘            g_{0,0,0}  g_{0,0,1}  g_{0,1,0}  g_{0,1,1}  g_{1,0,0}  g_{1,0,1}  g_{1,1,0}  g_{1,1,1}
g_{0,0,0}    g_{0,0,0}  g_{0,0,1}  g_{0,1,0}  g_{0,1,1}  g_{1,0,0}  g_{1,0,1}  g_{1,1,0}  g_{1,1,1}
g_{0,0,1}    g_{0,0,1}  g_{0,0,0}  g_{0,1,1}  g_{0,1,0}  g_{1,0,1}  g_{1,0,0}  g_{1,1,1}  g_{1,1,0}
g_{0,1,0}    g_{0,1,0}  g_{0,1,1}  g_{0,0,0}  g_{0,0,1}  g_{1,1,0}  g_{1,1,1}  g_{1,0,0}  g_{1,0,1}
g_{0,1,1}    g_{0,1,1}  g_{0,1,0}  g_{0,0,1}  g_{0,0,0}  g_{1,1,1}  g_{1,1,0}  g_{1,0,1}  g_{1,0,0}
g_{1,0,0}    g_{1,0,0}  g_{1,0,1}  g_{1,1,0}  g_{1,1,1}  g_{0,0,0}  g_{0,0,1}  g_{0,1,0}  g_{0,1,1}
g_{1,0,1}    g_{1,0,1}  g_{1,0,0}  g_{1,1,1}  g_{1,1,0}  g_{0,0,1}  g_{0,0,0}  g_{0,1,1}  g_{0,1,0}
g_{1,1,0}    g_{1,1,0}  g_{1,1,1}  g_{1,0,0}  g_{1,0,1}  g_{0,1,0}  g_{0,1,1}  g_{0,0,0}  g_{0,0,1}
g_{1,1,1}    g_{1,1,1}  g_{1,1,0}  g_{1,0,1}  g_{1,0,0}  g_{0,1,1}  g_{0,1,0}  g_{0,0,1}  g_{0,0,0}

55. If $H_1$ is the subgroup $\{g_{0,0,0}\}$, find the elements of the field $L^{H_1}$.

56. Compute $[L : L^{H_1}]$ and $[L^{H_1} : \mathbb{Q}]$.

57. If $H_2$ is the subgroup $\{g_{0,0,0}, g_{0,0,1}\}$, find the elements of the field $L^{H_2}$.

58. Compute $[L : L^{H_2}]$ and $[L^{H_2} : \mathbb{Q}]$.

59. If $H_3$ is the subgroup $\{g_{0,0,0}, g_{0,1,0}\}$, find the elements of the field $L^{H_3}$.

60. Compute $[L : L^{H_3}]$ and $[L^{H_3} : \mathbb{Q}]$.

61. If $H_4$ is the subgroup $\{g_{0,0,0}, g_{0,1,1}\}$, find the elements of the field $L^{H_4}$.

62. Compute $[L : L^{H_4}]$ and $[L^{H_4} : \mathbb{Q}]$.

63. If $H_5$ is the subgroup $\{g_{0,0,0}, g_{1,0,0}\}$, find the elements of the field $L^{H_5}$.

64. Compute $[L : L^{H_5}]$ and $[L^{H_5} : \mathbb{Q}]$.

65. If $H_6$ is the subgroup $\{g_{0,0,0}, g_{1,0,1}\}$, find the elements of the field $L^{H_6}$.


66. Compute $[L : L^{H_6}]$ and $[L^{H_6} : \mathbb{Q}]$.

67. If $H_7$ is the subgroup $\{g_{0,0,0}, g_{1,1,0}\}$, find the elements of the field $L^{H_7}$.

68. Compute $[L : L^{H_7}]$ and $[L^{H_7} : \mathbb{Q}]$.

69. If $H_8$ is the subgroup $\{g_{0,0,0}, g_{1,1,1}\}$, find the elements of the field $L^{H_8}$.

70. Compute $[L : L^{H_8}]$ and $[L^{H_8} : \mathbb{Q}]$.

71. If $H_9$ is the subgroup $\{g_{0,0,0}, g_{0,0,1}, g_{0,1,0}, g_{0,1,1}\}$, find the elements of the field $L^{H_9}$.

72. Compute $[L : L^{H_9}]$ and $[L^{H_9} : \mathbb{Q}]$.

73. If $H_{10}$ is the subgroup $\{g_{0,0,0}, g_{0,0,1}, g_{1,0,0}, g_{1,0,1}\}$, find the elements of the field $L^{H_{10}}$.

74. Compute $[L : L^{H_{10}}]$ and $[L^{H_{10}} : \mathbb{Q}]$.

75. If $H_{11}$ is the subgroup $\{g_{0,0,0}, g_{0,0,1}, g_{1,1,0}, g_{1,1,1}\}$, find the elements of the field $L^{H_{11}}$.

76. Compute $[L : L^{H_{11}}]$ and $[L^{H_{11}} : \mathbb{Q}]$.

77. If $H_{12}$ is the subgroup $\{g_{0,0,0}, g_{0,1,0}, g_{1,0,0}, g_{1,1,0}\}$, find the elements of the field $L^{H_{12}}$.

78. Compute $[L : L^{H_{12}}]$ and $[L^{H_{12}} : \mathbb{Q}]$.

79. If $H_{13}$ is the subgroup $\{g_{0,0,0}, g_{0,1,0}, g_{1,0,1}, g_{1,1,1}\}$, find the elements of the field $L^{H_{13}}$.

80. Compute $[L : L^{H_{13}}]$ and $[L^{H_{13}} : \mathbb{Q}]$.

81. If $H_{14}$ is the subgroup $\{g_{0,0,0}, g_{0,1,1}, g_{1,0,0}, g_{1,1,1}\}$, find the elements of the field $L^{H_{14}}$.

82. Compute $[L : L^{H_{14}}]$ and $[L^{H_{14}} : \mathbb{Q}]$.

83. If $H_{15}$ is the subgroup $\{g_{0,0,0}, g_{0,1,1}, g_{1,0,1}, g_{1,1,0}\}$, find the elements of the field $L^{H_{15}}$.

84. Compute $[L : L^{H_{15}}]$ and $[L^{H_{15}} : \mathbb{Q}]$.

85. If $H_{16}$ is all of $\mathrm{Gal}(L/\mathbb{Q})$, find the elements of the field $L^{H_{16}}$.

86. Compute $[L : L^{H_{16}}]$ and $[L^{H_{16}} : \mathbb{Q}]$.

87. If $M_1 = \mathbb{Q}$, find the elements of the subgroup $G_{M_1}$.

88. If $M_2 = \mathbb{Q}(i\sqrt{35})$, find the elements of the subgroup $G_{M_2}$.

89. If $M_3 = \mathbb{Q}(i\sqrt{7})$, find the elements of the subgroup $G_{M_3}$.

90. If $M_4 = \mathbb{Q}(i\sqrt{5})$, find the elements of the subgroup $G_{M_4}$.

91. If $M_5 = \mathbb{Q}(i)$, find the elements of the subgroup $G_{M_5}$.

92. If $M_6 = \mathbb{Q}(\sqrt{35})$, find the elements of the subgroup $G_{M_6}$.


93. If $M_7 = \mathbb{Q}(\sqrt{7})$, find the elements of the subgroup $G_{M_7}$.

94. If $M_8 = \mathbb{Q}(\sqrt{5})$, find the elements of the subgroup $G_{M_8}$.

95. If $M_9 = \mathbb{Q}(i\sqrt{5}, i\sqrt{7})$, find the elements of the subgroup $G_{M_9}$.

96. If $M_{10} = \mathbb{Q}(\sqrt{35}, i)$, find the elements of the subgroup $G_{M_{10}}$.

97. If $M_{11} = \mathbb{Q}(\sqrt{7}, i\sqrt{5})$, find the elements of the subgroup $G_{M_{11}}$.

98. If $M_{12} = \mathbb{Q}(\sqrt{7}, i)$, find the elements of the subgroup $G_{M_{12}}$.

99. If $M_{13} = \mathbb{Q}(\sqrt{5}, i\sqrt{7})$, find the elements of the subgroup $G_{M_{13}}$.

100. If $M_{14} = \mathbb{Q}(\sqrt{5}, i)$, find the elements of the subgroup $G_{M_{14}}$.

101. If $M_{15} = \mathbb{Q}(\sqrt{5}, \sqrt{7})$, find the elements of the subgroup $G_{M_{15}}$.

102. If $M_{16} = L$, find the elements of the subgroup $G_{M_{16}}$.
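
As a cross-check on the list of subgroups used in exercises 55-85, the following sketch enumerates every subgroup of the eight-element group by brute force. It rests on our own encoding, not on anything stated in the exercises: $g_{j,k,l}$ is stored as the triple `(j, k, l)`, and composition adds the exponents modulo 2 because the signs $(-1)^j$, $(-1)^k$, $(-1)^l$ multiply. The names `ELEMENTS`, `compose`, `is_subgroup`, and `subgroups` are hypothetical.

```python
from itertools import combinations

# g_{j,k,l} is encoded as the triple (j, k, l).
ELEMENTS = [(j, k, l) for j in (0, 1) for k in (0, 1) for l in (0, 1)]
IDENTITY = (0, 0, 0)


def compose(g, h):
    """Compose two automorphisms: the sign exponents add modulo 2."""
    return tuple((x + y) % 2 for x, y in zip(g, h))


def is_subgroup(candidate):
    """A finite subset containing the identity and closed under
    composition is a subgroup."""
    s = set(candidate)
    return IDENTITY in s and all(compose(g, h) in s for g in s for h in s)


def subgroups():
    """All subgroups, searched among subsets whose size divides 8."""
    found = []
    for size in (1, 2, 4, 8):
        for candidate in combinations(ELEMENTS, size):
            if is_subgroup(candidate):
                found.append(candidate)
    return found


if __name__ == "__main__":
    for h in subgroups():
        print(len(h), h)
```

Only sizes dividing 8 are searched, which is justified by Lagrange's theorem; the output can be compared with the subgroups $H_1, \ldots, H_{16}$ named in the exercises.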


For exercises 103-105, please read the following: 


In Theorem 15.12, we showed that if $\mathbb{Q} \subseteq K \subseteq L \subseteq \mathbb{C}$ are fields such that $[L : K]$ is finite, then
$L = K(\beta)$, for some $\beta \in L$. In exercises 103-105, we will show that there exist fields $K \subseteq L$,
with [L : K] finite, such that L is not a simple extension of K. Thus, the hypothesis in 
Theorem 15.12 that K contain Q is needed. In doing these exercises, you might want to first 
review the final example from Section 12.5 as well as exercises 19 and 21 after Section 12.5. 


103. Let $K = \mathbb{Z}_p(t_1^p, t_2^p)$, $E = \mathbb{Z}_p(t_1^p, t_2)$, and $L = \mathbb{Z}_p(t_1, t_2)$. Observe that $K, E, L$ are
fields with $K \subseteq E \subseteq L$.

(a) Compute $[E : K]$ and $[L : E]$.

(b) Use part (a) to find $[L : K]$.

104. (a) Show that if $\alpha \in \mathbb{Z}_p[t_1, t_2]$, then $\alpha^p \in \mathbb{Z}_p[t_1^p, t_2^p]$.

(b) If $K \subseteq L$ are as in exercise 103, use part (a) of this exercise to show that if $\beta \in L$
then $\beta^p \in K$.

(c) Show that if $\beta \in L$, then $[K(\beta) : K]$ is equal to 1 or $p$.

105. Use exercises 103 and 104 to show that if $\beta \in L$, then $L \neq K(\beta)$. This shows that
although $L$ is a finite extension of $K$, it is not a simple extension of $K$.


For exercise 106, please read the following: 


Exercises 1-25, 26-47, and 48-102 dealt with a series of questions about the splitting fields
over $\mathbb{Q}$ of the polynomials $x^3 - 10$, $x^4 + 1$, and $(x^2 - 5)(x^2 - 7)(x^2 + 1)$, respectively. For
additional practice with splitting fields, Galois groups, and fixed fields, in exercise 106, we
will examine the splitting field of $x^4 - 2$ over $\mathbb{Q}$. In this exercise, the Galois group will be a
group we have seen before, but we have not as yet seen it arise as the Galois group of a
splitting field over $\mathbb{Q}$. Before beginning this exercise, keep in mind that it has many, many
parts and could easily have been split into over 40 separate exercises.


106. Let $L$ be the splitting field of $x^4 - 2$ over $\mathbb{Q}$.

(a) Show that $L = \mathbb{Q}(2^{1/4}, i)$ and then find $[L : \mathbb{Q}]$ and $|\mathrm{Gal}(L/\mathbb{Q})|$.

(b) Describe the behavior of each element of $\mathrm{Gal}(L/\mathbb{Q})$ on $2^{1/4}$ and $i$ and then find the
complete table for $\mathrm{Gal}(L/\mathbb{Q})$.

(c) To what familiar group is $\mathrm{Gal}(L/\mathbb{Q})$ isomorphic?

(d) Use the table for $\mathrm{Gal}(L/\mathbb{Q})$ to find all subgroups of $\mathrm{Gal}(L/\mathbb{Q})$.

(e) For every subgroup $H$ of $\mathrm{Gal}(L/\mathbb{Q})$, find the fixed field $L^H$.

(f) For each subgroup $H$ of $\mathrm{Gal}(L/\mathbb{Q})$, check if $H$ is a normal subgroup of $\mathrm{Gal}(L/\mathbb{Q})$
and also check if the corresponding fixed field $L^H$ is the splitting field of some
polynomial over $\mathbb{Q}$.

(g) Was there any relationship in part (f) between when $H$ is normal and when $L^H$ is a
splitting field?




Geometric Constructions 


In the final two chapters, we settle the two great “impossibility” problems introduced in 
Chapter 1—namely, trisecting angles with a ruler and compass and finding a formula for the 
roots of fifth-degree polynomials. For both problems, we will be confronted with the difficult 
issue of how we go about proving that something cannot be done. 


Proving that angles cannot be trisected with a ruler and compass will be done in three steps: 


I. Formally define what it means for a real number to be constructible with ruler and 
compass. 


II. Show that the degree of a field extension over Q of a field generated by a constructible
real number must be $2^n$, where $n \ge 0$ is an integer.


III. Show that if angles could be trisected, then numbers could be constructed that do not
generate extensions of Q of degree $2^n$.


16.1 Constructible Points and Constructible Real Numbers 


In Chapter 2, we showed that depending on our definition of the word ruler, it is possible to 
trisect angles. Therefore, in this chapter, we need to be very precise about the meaning of the 
terms we use when proving that various geometric objects cannot be constructed. Not only 
must we be very careful about what we mean by ruler and compass, but we will also need to 
carefully describe exactly how the ruler and compass are allowed to be used. 


We begin with the xy-plane and designate the points (0, 0) and (1, 0) on the x-axis. 


This is done for two reasons. First, the only time we will be allowed to use either the ruler or 
compass is when we draw a line or circle with points that have previously been constructed. 
Therefore, in order to get started, we need two initial points, (0, 0) and (1, 0), that we already 
consider to be constructed. Second, as we construct additional points in the plane beyond
(0, 0) and (1, 0), we will need to determine the x and y coordinates of these points. By
beginning with the points (0, 0) and (1, 0), we associate the number 1 to the distance between
these two particular points, and this will be used to determine the coordinates of other points
as they are constructed.


[Figure: the x-axis with the initial points (0, 0) and (1, 0) marked.]


Allowable Moves: Let $P_0 = (0, 0)$ and $P_1 = (1, 0)$ be the initial constructible points.


1. The ruler may only be used to draw a line connecting two points that have already been 
constructed. 


2. The compass may only be used to draw a circle using two points that have already been 
constructed, where one of the points is the center of the circle and the other lies on the 
circumference of the circle. 


We consider a point in the plane to be constructible if it is the intersection of 
(a) two lines, 

(b) two circles, or 

(c) one line and one circle, 

both of which have been drawn according to the allowable moves. 


More formally, we record this as 


Definition 16.1. A point $P$ in the xy-plane is constructible if there exists a sequence of points

$P_0 = (0, 0),\ P_1 = (1, 0),\ P_2, \ldots, P_n = P$

such that every $P_i$, for $2 \le i \le n$, is the intersection of either two lines, two circles, or one line
and one circle that have been constructed using only the points $P_0, P_1, \ldots, P_{i-1}$.


This immediately leads us to 


Definition 16.2. A real number $\alpha$ is constructible if it is the x or y coordinate of a
constructible point.




Observe that our list of allowable moves eliminates any ambiguity about how the ruler can 
be used. All markings on the ruler are to be ignored, so the ruler can only be used like a 
straightedge. In particular, the “trisection” done in Chapter 2 uses a move that is clearly 
not permitted. 


When you use a compass, you might occasionally draw a circle and then move the compass to 
another location to draw a new circle using the same radius. If you have done this, then you 
have allowed the compass to keep the same radius as it moved from one location to the next, 
and, in this case, we say that you have used a rigid compass. However, we do not assume that 
our compass is rigid, so preserving the radius when the compass is moved is not an allowable 
move. 


On the other hand, it is now reasonable to ask whether a sequence of allowable moves will 
allow us to move the compass from one constructible point to another while preserving the 
radius. If the answer is yes, then our allowable moves will enable us to perform all the 
constructions that can be done with a rigid compass. In order to show that the answer is indeed 
yes and to also develop a sense of which real numbers are constructible, we begin with four 
basic constructions. 


Basic construction #1—Constructing a perpendicular line from a point on a line 
Given the line connecting points A and B, we will construct the line perpendicular to this one 
at the point A. 


(a) Draw the circle with center A having B on the circumference and let C denote the other 
point where this circle intersects the line connecting A and B. 




(b) Draw the circle centered at B with C on the circumference and also draw the circle 
centered at C with B on the circumference. Let D be either of the intersection points of 
these two circles. 


(c) Draw the line connecting D and A. 




Observe that BCD is an equilateral triangle and triangles ABD and ACD are congruent. The 
line connecting A and D is perpendicular to the line connecting A and B at the point A. In fact, 
the line connecting A and D is the perpendicular bisector of the line segment BC. 
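
The three steps of basic construction #1 can be checked numerically. The sketch below is an illustration with hypothetical names (`perpendicular_point` is ours) and floating-point coordinates: it computes C and D from A and B as in steps (a) and (b), so that one can confirm that BCD is equilateral and that the line through A and D is perpendicular to the line through A and B.

```python
import math


def perpendicular_point(A, B):
    """Follow steps (a) and (b) of basic construction #1 in coordinates.

    Returns (C, D), where C is the second point in which the circle
    centered at A through B meets line AB, and D is one of the two
    intersection points of the circle centered at B through C with the
    circle centered at C through B.
    """
    ax, ay = A
    bx, by = B
    # Step (a): C is the reflection of B through A.
    C = (2 * ax - bx, 2 * ay - by)
    # Step (b): both circles have radius |BC|, so BCD is equilateral and
    # D lies at height sqrt(r^2 - (r/2)^2) above the midpoint A of BC.
    r = math.dist(B, C)
    h = math.sqrt(r * r - (r / 2) ** 2)
    ab = math.dist(A, B)
    ux, uy = (bx - ax) / ab, (by - ay) / ab  # unit vector along AB
    D = (ax - uy * h, ay + ux * h)           # move perpendicular to AB
    return C, D


if __name__ == "__main__":
    A, B = (0.0, 0.0), (2.0, 1.0)
    C, D = perpendicular_point(A, B)
    dot = (D[0] - A[0]) * (B[0] - A[0]) + (D[1] - A[1]) * (B[1] - A[1])
    print("C =", C, "D =", D, "AD . AB =", dot)
```

The dot product of AD with AB should vanish up to rounding error, which is step (c): the line through D and A is the required perpendicular.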


Basic construction #2—Dropping a perpendicular line to a line from a point off the line 
Given a point C, not on the line connecting points A and B, we will construct the line through 
C that is perpendicular to the line connecting A and B. 




(a) Draw the circle with center C having A on the circumference and then let D denote the 
point, other than A, where this circle intersects the line connecting A and B. 




(b) Draw the circles centered at A and at D that both have C on the circumference and then 
let E denote the point, other than C, where these two circles intersect. 


(c) Draw the line connecting the points C and E. It passes through C and is perpendicular to 
the line connecting A and B. 


Observe that ACDE is a rhombus, triangles ACE and DCE are congruent, and the line
connecting C and E is the perpendicular bisector of the line segment AD.


In part (a) of the previous construction, we let D denote the second point where the circle
centered at C intersected the line connecting A and B. You should convince yourself that it is
possible that there is no second intersection point, but this happens if and only if the line
connecting C and A is perpendicular to the original line.


Basic construction #3—Constructing a line parallel to a given line through a point off the line 


Given a point C, not on the line connecting points A and B, we will construct the line through 
C that is parallel to the line connecting A and B. 




(a) Using basic construction #2, drop a perpendicular line from C to the line connecting A 
and B and let D denote the point where these two lines intersect. 




(b) Using basic construction #1, construct the line perpendicular to the line connecting C and
D at C.




Observe that the line constructed in part (b) and the line connecting A and B are both 
perpendicular to the line connecting C and D. Therefore, the line constructed in part (b) is 
parallel to the original line. 


Basic construction #4— Transporting the distance between two constructible points to use as the 
radius of a circle centered at a third constructible point 

Given points A, B, C, we will draw a circle centered at C whose radius is equal to the distance 
between A and B. 




(a) Draw the line connecting A and B, and then draw the line connecting A and C. 


(b) Using basic construction #3, construct the line through C that is parallel to the line 
connecting A and B, and then, using basic construction #3, construct the line through B 
that is parallel to the line connecting A and C. Let D denote the intersection of these 
two lines. 




(c) Draw the circle centered at C with D on the circumference. 


Observe that ABDC is a parallelogram. Therefore, line segments AB and CD have the same 
length. As a result, the radius of the circle centered at C with D on the circumference is equal 
to the distance between A and B. 


The previous construction assumes that C does not lie on the line connecting A and B. You
should convince yourself that if A, B, C do lie on the same line, then you can construct a point
B′ such that the line segments AB and AB′ have the same length and B′ is not on the line
connecting A and C. Therefore, the preceding construction will still work after replacing the
point B with B′.


Since each of our four basic constructions has been shown to be a short sequence of allowable 
moves, we can now freely apply the basic constructions to generate many constructible 
numbers. 


Theorem 16.3 


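(a) If $\alpha, \beta \in \mathbb{R}$ are constructible, then so are $\alpha + \beta$ and $\alpha - \beta$.


(b) If $\alpha, \beta \in \mathbb{R}$ are constructible, with $\alpha > 0$ and $\beta > 0$, then $\alpha\beta$ and $\frac{\alpha}{\beta}$ are both constructible.


(c) If $\alpha \in \mathbb{R}$ is constructible and $\alpha > 0$, then $\sqrt{\alpha}$ is also constructible.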


Proof. We should first observe that by drawing the line through the points (0, 0) and (1, 0) and 
then constructing the line perpendicular to this one at (0, 0), our allowable moves and basic 
constructions have allowed us to draw both the x- and y-axes. 


For part (a), since $\alpha$ is constructible, it is a coordinate of a constructible point $P$. For the
moment, let us assume that $\alpha$ is the x coordinate of $P$. We can drop a perpendicular from
$P$ to the x-axis, and it will intersect the x-axis at the point $(\alpha, 0)$. We can now draw the circle
centered at (0, 0) with $(\alpha, 0)$ on the circumference, and it will intersect the x- and y-axes
at the points $(\pm\alpha, 0)$ and $(0, \pm\alpha)$.


Observe that if $\alpha$ was the y coordinate of $P$, then we would have begun by first dropping a
perpendicular from $P$ to the y-axis. The same reasoning as in the previous paragraph would
once again show that the points $(\pm\alpha, 0)$ and $(0, \pm\alpha)$ are all constructible. Applying the same
argument to $\beta$, it follows that the points $(\pm\beta, 0)$ and $(0, \pm\beta)$ are also constructible. Next,
basic construction #4 allows us to draw a circle centered at $(\alpha, 0)$ with radius $|\beta|$. Observe
that it will intersect the x-axis at the points $(\alpha \pm \beta, 0)$. Thus, $\alpha + \beta$ and $\alpha - \beta$ are both constructible.


For part (b), we know that our arguments in part (a) have told us that the points $A = (\beta, 0)$
and $B = (0, \alpha)$ are constructible. Draw the line connecting $P_1 = (1, 0)$ to $B$, and then draw
the line parallel to this one through $A$. If we let $C$ denote the point where the line drawn
through $A$ intersects the y-axis, then triangles $P_0P_1B$ and $P_0AC$ are similar. As a result, the
corresponding sides are in proportion and if we let $\gamma$ be the y coordinate of $C$,
we have

$\frac{\gamma}{\alpha} = \frac{\beta}{1}.$

Therefore, $\gamma = \alpha\beta$ and since $\gamma$ has been shown to be constructible, we can conclude that $\alpha\beta$ is
constructible.


Next, we let $A$ and $B$ be as in the previous construction, but we now draw the line connecting
$A$ and $B$ and then draw the line parallel to this one through $P_1$. If we let $D$ be the point where
the line drawn through $P_1$ intersects the y-axis, then the triangles $P_0P_1D$ and $P_0AB$ are
similar. Since the corresponding sides are in proportion, when we let $\delta$ be the y coordinate
of $D$, we have

$\frac{\delta}{\alpha} = \frac{1}{\beta}.$

Therefore, $\delta = \frac{\alpha}{\beta}$ and since $\delta$ has been shown to be constructible, we see that $\frac{\alpha}{\beta}$ is constructible.
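
The two similar-triangle arguments of part (b) can be replayed in coordinates. The sketch below is only an illustration: the function names are ours, and it uses exact rational inputs via Python's `fractions.Fraction` purely for convenience, whereas the theorem applies to arbitrary positive constructible numbers. Each function finds the y-intercept of the parallel line drawn in the proof.

```python
from fractions import Fraction


def y_intercept_of_parallel(p, q, through):
    """y-intercept of the line through `through` that is parallel to the
    (non-vertical) line joining p and q."""
    slope = Fraction(q[1] - p[1]) / Fraction(q[0] - p[0])
    return through[1] - slope * through[0]


def product_by_construction(alpha, beta):
    """y coordinate of C: the line through A = (beta, 0) parallel to P1 B."""
    P1, A, B = (1, 0), (beta, 0), (0, alpha)
    return y_intercept_of_parallel(P1, B, A)


def quotient_by_construction(alpha, beta):
    """y coordinate of D: the line through P1 = (1, 0) parallel to A B."""
    P1, A, B = (1, 0), (beta, 0), (0, alpha)
    return y_intercept_of_parallel(A, B, P1)


if __name__ == "__main__":
    a, b = Fraction(3, 2), Fraction(4, 5)
    print(product_by_construction(a, b), quotient_by_construction(a, b))
```

For the sample values $\alpha = 3/2$ and $\beta = 4/5$, the two intercepts are $\alpha\beta = 6/5$ and $\alpha/\beta = 15/8$, as the proof predicts.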


For part (c), if we let $A = (\alpha, 0)$ and $B = (\alpha + 1, 0)$, then $A$ and $B$ are both constructible.
Recall that our first basic construction actually showed how to construct the perpendicular
bisector of a line segment. If we let $C$ be the point where the perpendicular bisector of the line
segment $P_0B$ intersects the x-axis, then $C = \left(\frac{\alpha + 1}{2}, 0\right)$. Next, draw the circle centered at $C$ with
$P_0$ on the circumference. Then draw the line perpendicular to the x-axis at $A$ and let $D$ denote
the point in the first quadrant where this line intersects the circle centered at $C$.




Observe that $\angle P_0DB$ is a right angle, so the triangles $AP_0D$ and $ADB$ are similar, with
$\angle AP_0D = \angle ADB$ and $\angle ADP_0 = \angle ABD$. If we let $\beta$ be the y coordinate of $D$, then $\beta$ is the
length of line segment $AD$, $\alpha$ is the length of the line segment $P_0A$, and 1 is the length of the
line segment $AB$.


Since the lengths of the corresponding sides of similar triangles are in proportion, we have

$\frac{\beta}{\alpha} = \frac{1}{\beta}.$

This immediately implies that $\beta^2 = \alpha$, so $\beta = \sqrt{\alpha}$. Since $\beta$ is the y coordinate of a
constructible point, it is constructible. As a result, $\sqrt{\alpha}$ is constructible, as required.
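
The geometry of part (c) is easy to test numerically. In the sketch below, which uses floating-point arithmetic and a function name of our own choosing, the height of $D$ is computed directly from the circle centered at $C$ through $P_0$; the point is only to confirm that this height agrees with $\sqrt{\alpha}$.

```python
import math


def sqrt_by_construction(alpha):
    """Height of D above A = (alpha, 0) on the circle of part (c).

    The circle is centered at C = ((alpha + 1) / 2, 0) and passes
    through P0 = (0, 0), so its radius equals the x coordinate of C.
    """
    center_x = (alpha + 1) / 2
    radius = center_x
    # D = (alpha, y) lies on the circle: (alpha - center_x)^2 + y^2 = radius^2.
    return math.sqrt(radius ** 2 - (alpha - center_x) ** 2)


if __name__ == "__main__":
    for alpha in (2.0, 9.0, 0.25):
        print(alpha, sqrt_by_construction(alpha), math.sqrt(alpha))
```

The agreement reflects the identity $\left(\frac{\alpha + 1}{2}\right)^2 - \left(\frac{\alpha - 1}{2}\right)^2 = \alpha$, which is the coordinate form of the similar-triangle argument.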


In light of Theorem 16.3, we easily obtain 


Corollary 16.4. The set of constructible numbers is a field lying between Q and R. 


Proof. Since $P_0 = (0, 0)$ and $P_1 = (1, 0)$ are constructible points, 0 and 1 are constructible
numbers. Theorem 16.3(a) asserts that beginning with 0, whenever we add or subtract copies
of 1, we continue to obtain constructible numbers. Therefore, every element of Z is
constructible. Every positive element of Q is the quotient of elements of N, so
Theorem 16.3(b) tells us that every positive rational number is constructible. However,
Theorem 16.3(a) also tells us that the additive inverse of every constructible number is
constructible, so every element of Q is constructible. Since the set of constructible numbers
was defined as a subset of R, it follows that this set lies between Q and R.


The set of constructible numbers is contained in the field R, so in order to show that it is a
subfield of R, it suffices to show that it is a group under addition and its nonzero elements are
a group under multiplication. In light of Proposition 8.6(a), to show that the constructible
numbers are a group under addition, we only need to show that $\alpha - \beta$ is constructible
whenever $\alpha$ and $\beta$ are. However, Theorem 16.3(a) certainly tells us that this is indeed the case.


If $\alpha, \beta$ are nonzero constructible numbers, then Theorem 16.3(b) shows that $\frac{|\alpha|}{|\beta|}$ is
constructible. Thus, using Theorem 16.3(a), we know that $\frac{|\alpha|}{|\beta|}$ and $-\frac{|\alpha|}{|\beta|}$ are both constructible.
Since $\frac{\alpha}{\beta}$ is equal to either $\frac{|\alpha|}{|\beta|}$ or $-\frac{|\alpha|}{|\beta|}$, it follows that $\frac{\alpha}{\beta}$ is certainly constructible. Therefore,
Proposition 8.6(a) now applies to tell us that the nonzero constructible numbers are a group
under multiplication and the set of constructible numbers is indeed a subfield of R.


Theorem 16.3 and Corollary 16.4 enable us to easily generate constructible numbers. 


Examples


The following numbers are all constructible: 


V2+75 
V2-V/5° 


3, 4/2. 4/5, a/ 2-4/5, 


VV24V3, VV24+V5+V3 


It takes some work, but it can be shown that the degrees of the minimum polynomials over Q
of the preceding seven numbers are, respectively, 1, 2, 2, 4, 4, 8, 16. Observe that each of these
degrees is a power of 2. It certainly leads one to conjecture that if $\alpha \in \mathbb{R}$ is constructible, then
$[\mathbb{Q}(\alpha) : \mathbb{Q}] = 2^n$, where $n \ge 0$ is an integer. In order to prove this, we need to start translating
some of our observations about constructions into facts about field extensions.


Lemma 16.5. Let $K$ be a field such that $\mathbb{Q} \subseteq K \subseteq \mathbb{R}$ and suppose $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ are
constructible points such that $\alpha_1, \alpha_2, \beta_1, \beta_2 \in K$.


(a) If $l$ is the line connecting $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$, then $l$ has equation $ax + by = c$, where
$a, b, c \in K$.


(b) If $C$ is the circle centered at $(\alpha_1, \beta_1)$ with $(\alpha_2, \beta_2)$ on the circumference, then $C$ has
equation $(x - a)^2 + (y - b)^2 = c$, where $a, b, c \in K$.


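Proof. For part (a), if $l$ is vertical, then it has equation $x = \alpha_1$, which is certainly of the desired
form. On the other hand, if $l$ is not vertical, then $\alpha_2 \neq \alpha_1$ and the point-slope formula tells us
that $l$ has equation

$y - \beta_1 = \left(\frac{\beta_2 - \beta_1}{\alpha_2 - \alpha_1}\right)(x - \alpha_1).$

This equation can be rewritten as

$(\beta_2 - \beta_1)x + (\alpha_1 - \alpha_2)y = \alpha_1\beta_2 - \alpha_2\beta_1$

and since $\beta_2 - \beta_1$, $\alpha_1 - \alpha_2$, $\alpha_1\beta_2 - \alpha_2\beta_1 \in K$, this equation is of the desired form.


For part (b), the distance between $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ is $\sqrt{(\alpha_2 - \alpha_1)^2 + (\beta_2 - \beta_1)^2}$, so the equation of $C$ is

$(x - \alpha_1)^2 + (y - \beta_1)^2 = (\alpha_2 - \alpha_1)^2 + (\beta_2 - \beta_1)^2.$

Since $\alpha_1$, $\beta_1$, and $(\alpha_2 - \alpha_1)^2 + (\beta_2 - \beta_1)^2$ all belong to $K$, this equation is also of the desired form.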
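
The coefficients produced in this proof can be computed directly. The sketch below is ours: it takes $K = \mathbb{Q}$ as an illustrative special case, represents its elements with Python's `fractions.Fraction`, and uses the hypothetical names `line_through` and `circle_through`. Because only additions, subtractions, and multiplications occur, the coefficients stay in whatever field the coordinates come from.

```python
from fractions import Fraction


def line_through(p1, p2):
    """Coefficients (a, b, c) of a x + b y = c for the line through p1 and p2.

    Uses the rewritten form from the proof of Lemma 16.5(a); when the
    line is vertical the same formula reduces to x = alpha_1.
    """
    (a1, b1), (a2, b2) = p1, p2
    return (b2 - b1, a1 - a2, a1 * b2 - a2 * b1)


def circle_through(center, point):
    """Coefficients (a, b, c) of (x - a)^2 + (y - b)^2 = c for the circle
    centered at `center` with `point` on its circumference."""
    (a1, b1), (a2, b2) = center, point
    return (a1, b1, (a2 - a1) ** 2 + (b2 - b1) ** 2)


if __name__ == "__main__":
    p1 = (Fraction(1), Fraction(2))
    p2 = (Fraction(3), Fraction(5))
    print(line_through(p1, p2))
    print(circle_through(p1, p2))
```

Note that the third coefficient of the circle is the squared radius, so no square root is ever taken.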


We can now prove the main result of this section. 


Theorem 16.6. If $\alpha \in \mathbb{R}$ is constructible, then $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 2^l$, where $l \ge 0$ is an integer.


Proof. If $\alpha$ is constructible, then Definitions 16.1 and 16.2 tell us that $\alpha$ is one of the
coordinates of a point $P$ such that there is a sequence of points

$P_0 = (0, 0),\ P_1 = (1, 0),\ P_2, \ldots, P_n = P$

and, for $2 \le i \le n$, every $P_i$ is the intersection of either two lines, two circles, or one line and
one circle that have been constructed using only the points $P_0, P_1, \ldots, P_{i-1}$.


We now use the sequence of points to construct a sequence of fields

$K_1 = \mathbb{Q},\ K_2 = K_1(\alpha_2, \beta_2),\ K_3 = K_2(\alpha_3, \beta_3),\ \ldots,\ K_n = K_{n-1}(\alpha_n, \beta_n)$

where, for $2 \le i \le n$, $P_i = (\alpha_i, \beta_i)$. Observe that, for $2 \le i \le n$, $K_i$ is the smallest field that
contains $\mathbb{Q}$ as well as the coordinates of the points $P_0, P_1, \ldots, P_{i-1}, P_i$.


Using Theorem 14.6 and the fact that $K_1 = \mathbb{Q}$, we have

(1) $[K_n : \mathbb{Q}] = [K_n : K_{n-1}] \cdot [K_{n-1} : K_{n-2}] \cdots [K_3 : K_2] \cdot [K_2 : K_1].$


We now claim that, for $2 \le t \le n$, $[K_t : K_{t-1}]$ is equal to 1 or 2. Note that $K_t$ is the smallest
field that contains $K_{t-1}$, $\alpha_t$, and $\beta_t$. In addition, $P_t$ is the intersection of two lines, a line and a
circle, or two circles, where the lines and circles are drawn using points whose coordinates
belong to $K_{t-1}$. Our proof now branches out into three cases.


Case I—Two Lines: By Lemma 16.5, $P_t$ is the intersection of two lines with equations
$a_1x + b_1y = c_1$ and $a_2x + b_2y = c_2$, where $a_1, a_2, b_1, b_2, c_1, c_2 \in K_{t-1}$. Since the two lines are
different and meet at $P_t$, at most one of them can be vertical. If one of them is vertical then, without loss of
generality, we may assume it is the first. Therefore, $b_1 = 0$, $a_1 \neq 0$, and we can rewrite the
equation of the first line as $x = \frac{c_1}{a_1}$. Since the second line is not vertical, we know $b_2 \neq 0$, and
plugging $x = \frac{c_1}{a_1}$ into this equation gives us

$y = \frac{a_1c_2 - a_2c_1}{a_1b_2}.$

Therefore,

$P_t = \left(\frac{c_1}{a_1}, \frac{a_1c_2 - a_2c_1}{a_1b_2}\right).$

As a result, the coordinates of $P_t$ are obtained from elements of the field $K_{t-1}$ by applying
operations which always produce elements of the same field. Therefore, both coordinates
of $P_t$ belong to $K_{t-1}$, so $K_t = K_{t-1}$ and so $[K_t : K_{t-1}] = 1$.


Next, suppose that neither line is vertical. Therefore, both $b_1$ and $b_2$ are nonzero. Since the
two lines have different slopes, it follows that $-\frac{a_1}{b_1} \neq -\frac{a_2}{b_2}$, which immediately implies that
$a_2 b_1 - a_1 b_2 \neq 0$. As a result, when we solve the simultaneous equations $a_1 x + b_1 y = c_1$ and
$a_2 x + b_2 y = c_2$, we obtain

\[ x = \frac{b_1 c_2 - b_2 c_1}{a_2 b_1 - a_1 b_2}, \qquad y = \frac{a_2 c_1 - a_1 c_2}{a_2 b_1 - a_1 b_2}. \]

Therefore,

\[ P_t = \left( \frac{b_1 c_2 - b_2 c_1}{a_2 b_1 - a_1 b_2},\ \frac{a_2 c_1 - a_1 c_2}{a_2 b_1 - a_1 b_2} \right). \]

Using the same argument as in the previous paragraph, the coordinates of $P_t$ are obtained by
applying operations to elements of $K_{t-1}$ that produce elements of $K_{t-1}$. Once again, the
coordinates of $P_t$ belong to $K_{t-1}$, so $K_t = K_{t-1}$ and $[K_t : K_{t-1}] = 1$.
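Case I can be illustrated with exact rational arithmetic. The following is a minimal Python sketch, assuming the base field is $\mathbb{Q}$ and using the hypothetical helper name `intersect_lines`; it applies the two displayed formulas and shows that only field operations are needed, so the intersection point stays in the same field as the coefficients.

```python
from fractions import Fraction

def intersect_lines(a1, b1, c1, a2, b2, c2):
    """Intersection of a1*x + b1*y = c1 and a2*x + b2*y = c2.

    All inputs are rational, and so is the output: only +, -, *, / are used.
    """
    a1, b1, c1, a2, b2, c2 = map(Fraction, (a1, b1, c1, a2, b2, c2))
    det = a2 * b1 - a1 * b2          # nonzero exactly when the lines are not parallel
    if det == 0:
        raise ValueError("the lines are parallel or identical")
    x = (b1 * c2 - b2 * c1) / det
    y = (a2 * c1 - a1 * c2) / det
    return x, y

# x + 2y = 3 and 3x - y = 2 meet at (1, 1).
print(intersect_lines(1, 2, 3, 3, -1, 2))
# A vertical first line 2x = 1 and the line x + 3y = 2 meet at (1/2, 1/2).
print(intersect_lines(2, 0, 1, 1, 3, 2))
```

The same formulas also cover the vertical case $b_1 = 0$, where they reduce to $x = c_1/a_1$.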


Case II: One Line and One Circle. By Lemma 16.5, $P_t$ is the intersection of a line with
equation $a_1 x + b_1 y = c_1$ and a circle with equation $(x - a_2)^2 + (y - b_2)^2 = c_2$, where
$a_1, a_2, b_1, b_2, c_1, c_2 \in K_{t-1}$. If the line is vertical, then $b_1 = 0$, $a_1 \neq 0$, and we can rewrite its
equation as $x = \frac{c_1}{a_1}$. Plugging this into the equation of the circle, we obtain

\[ \left( \frac{c_1}{a_1} - a_2 \right)^2 + (y - b_2)^2 = c_2. \]

Since $P_t = (\alpha_t, \beta_t)$, the previous equation tells us that $\beta_t$ satisfies a quadratic polynomial with
coefficients in $K_{t-1}$. As a result,

\[ [K_{t-1}(\beta_t) : K_{t-1}] = 1 \text{ or } 2. \]

Since $\alpha_t = \frac{c_1}{a_1} \in K_{t-1} \subseteq K_{t-1}(\beta_t)$, we have

\[ [K_{t-1}(\alpha_t, \beta_t) : K_{t-1}(\beta_t)] = 1, \]

which implies that

\[ [K_t : K_{t-1}] = [K_{t-1}(\alpha_t, \beta_t) : K_{t-1}(\beta_t)] \cdot [K_{t-1}(\beta_t) : K_{t-1}] = 1 \text{ or } 2. \]


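If the line is not vertical, then $b_1 \neq 0$, and we can rewrite its equation as

\[ y = -\frac{a_1}{b_1} x + \frac{c_1}{b_1}. \]

Plugging this into the equation of the circle results in

\[ (x - a_2)^2 + \left( -\frac{a_1}{b_1} x + \frac{c_1}{b_1} - b_2 \right)^2 = c_2. \]

The preceding equation is a quadratic polynomial in $x$ with leading coefficient $1 + \left( \frac{a_1}{b_1} \right)^2$.
Since $P_t = (\alpha_t, \beta_t)$, we see that $\alpha_t$ satisfies a quadratic polynomial with coefficients in $K_{t-1}$.
As a result,

\[ [K_{t-1}(\alpha_t) : K_{t-1}] = 1 \text{ or } 2. \]

Furthermore, since the point $(\alpha_t, \beta_t)$ satisfies the equation $y = -\frac{a_1}{b_1} x + \frac{c_1}{b_1}$, we have

\[ \beta_t = -\frac{a_1}{b_1} \alpha_t + \frac{c_1}{b_1} \in K_{t-1}(\alpha_t). \]

It now follows that $[K_{t-1}(\alpha_t, \beta_t) : K_{t-1}(\alpha_t)] = 1$, hence

\[ [K_t : K_{t-1}] = [K_{t-1}(\alpha_t, \beta_t) : K_{t-1}(\alpha_t)] \cdot [K_{t-1}(\alpha_t) : K_{t-1}] = 1 \text{ or } 2. \]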
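The quadratic that arises in the non-vertical part of Case II can be written out explicitly. The sketch below is illustrative only: it assumes rational coefficients, and the function name `line_circle_quadratic` is hypothetical. It returns the coefficients $(A, B, C)$ of the quadratic $A x^2 + B x + C = 0$ satisfied by the $x$-coordinate of an intersection point.

```python
from fractions import Fraction

def line_circle_quadratic(a1, b1, c1, a2, b2, c2):
    """Quadratic in x obtained by substituting the non-vertical line
    a1*x + b1*y = c1 into the circle (x - a2)^2 + (y - b2)^2 = c2."""
    a1, b1, c1, a2, b2, c2 = map(Fraction, (a1, b1, c1, a2, b2, c2))
    if b1 == 0:
        raise ValueError("the line is vertical; solve for y instead")
    s = -a1 / b1            # slope of the line
    k = c1 / b1 - b2        # y - b2 = s*x + k on the line
    A = 1 + s * s           # leading coefficient 1 + (a1/b1)^2
    B = -2 * a2 + 2 * s * k
    C = a2 * a2 + k * k - c2
    return A, B, C

# The line y = x and the circle x^2 + y^2 = 1 give 2x^2 - 1 = 0,
# so the x-coordinate 1/sqrt(2) has degree 2 over Q.
print(line_circle_quadratic(1, -1, 0, 0, 0, 1))
```

All three coefficients are computed with field operations only, which is why the new coordinate has degree at most 2 over the previous field.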
Case III: Two Circles. By Lemma 16.5, $P_t = (\alpha_t, \beta_t)$ is one of the intersection points of
circles with equations $(x - a_1)^2 + (y - b_1)^2 = c_1$ and $(x - a_2)^2 + (y - b_2)^2 = c_2$, where
$a_1, a_2, b_1, b_2, c_1, c_2 \in K_{t-1}$. Therefore, the ordered pair $(\alpha_t, \beta_t)$ is also a solution of the
following equation that is obtained by subtracting the equations of the two circles:

\[ \left( (x - a_1)^2 + (y - b_1)^2 \right) - \left( (x - a_2)^2 + (y - b_2)^2 \right) = c_1 - c_2. \]

This can be rewritten as

\[ (2a_2 - 2a_1)x + (a_1^2 - a_2^2) + (2b_2 - 2b_1)y + (b_1^2 - b_2^2) = c_1 - c_2, \]

which simplifies to

\[ (2a_2 - 2a_1)x + (2b_2 - 2b_1)y = (c_1 - c_2) - (a_1^2 - a_2^2) - (b_1^2 - b_2^2). \]

Observe that this last equation is of the form $ax + by = c$, where $a, b, c \in K_{t-1}$. Therefore, the
point $P_t = (\alpha_t, \beta_t)$ is an intersection point of a line and a circle of the form we examined in
Case II. Hence, the conclusions in Case II also apply in this case, and we once again have

\[ [K_t : K_{t-1}] = [K_{t-1}(\alpha_t, \beta_t) : K_{t-1}] = 1 \text{ or } 2. \]
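The reduction of Case III to Case II can be checked on a small instance. This is a hedged sketch with rational data and a hypothetical helper name, `radical_line`, which returns the coefficients of the line obtained by subtracting the two circle equations.

```python
from fractions import Fraction

def radical_line(a1, b1, c1, a2, b2, c2):
    """Coefficients (a, b, c) of the line a*x + b*y = c obtained by subtracting
    (x - a2)^2 + (y - b2)^2 = c2 from (x - a1)^2 + (y - b1)^2 = c1."""
    a1, b1, c1, a2, b2, c2 = map(Fraction, (a1, b1, c1, a2, b2, c2))
    a = 2 * a2 - 2 * a1
    b = 2 * b2 - 2 * b1
    c = (c1 - c2) - (a1 ** 2 - a2 ** 2) - (b1 ** 2 - b2 ** 2)
    return a, b, c

# Unit circles centered at (0, 0) and (1, 0): the line is 2x = 1, i.e. x = 1/2,
# which passes through both intersection points (1/2, +-sqrt(3)/2).
print(radical_line(0, 0, 1, 1, 0, 1))
```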
In all three cases, we have succeeded in showing that $[K_t : K_{t-1}]$ is equal to 1 or 2, for
$2 \le t \le n$. Therefore, when we look back at equation (1), every term that is being multiplied
on the right-hand side of

\[ [K_n : \mathbb{Q}] = [K_n : K_{n-1}] \cdot [K_{n-1} : K_{n-2}] \cdots [K_3 : K_2] \cdot [K_2 : K_1] \]

is either 1 or 2. Therefore, $[K_n : \mathbb{Q}] = 2^m$, for some integer $m \ge 0$.


Finally, if $\alpha$ is constructible, then $\alpha$ is one of the coordinates of the final point $P_n$ of the
sequence of points

\[ P_0 = (0, 0),\quad P_1 = (1, 0),\quad P_2, \ldots, P_n = P \]

introduced at the beginning of this proof. Since the field $K_n$ contains both coordinates of $P_n$,
we know that $\alpha \in K_n$, and Theorem 14.6 asserts that

\[ 2^m = [K_n : \mathbb{Q}] = [K_n : \mathbb{Q}(\alpha)] \cdot [\mathbb{Q}(\alpha) : \mathbb{Q}]. \]

As a result, $[\mathbb{Q}(\alpha) : \mathbb{Q}]$ is a divisor of $2^m$, so $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 2^l$, for some integer $l \ge 0$.


16.2 The Impossibility of Trisecting Angles 


We now have all the tools needed to prove that angles cannot be trisected. But first, we need to
put this problem in perspective. If $\alpha$ is the distance between two constructible points, then
basic construction #4 asserts that we can construct a circle with radius $\alpha$ centered at the origin.
Since the point $(\alpha, 0)$ is an intersection point of this circle and the x-axis, it follows that $\alpha$ is
constructible. Theorem 16.3(b) now indicates that

\[ \frac{\alpha}{2},\ \frac{\alpha}{3},\ \frac{\alpha}{4},\ \ldots,\ \frac{\alpha}{n} \]

are also constructible. This tells us that any constructible line segment can be bisected,
trisected, or, more generally, divided into $n$ equal pieces, for any $n \in \mathbb{N}$.


It is then natural to turn our attention to angles. If an angle has been constructed, it is not hard to 
bisect it. To see this, if we are given an angle that has been constructed, let A denote its vertex. 




Basic construction #4 allows us to draw the circle with radius 1 centered at A. Let B and C 
denote the points where this circle intersects the two rays forming our angle. Next, draw the 
circles with radius 1 centered at B and at C and let D denote the intersection point, other than 
A, of these circles. Draw the line connecting A and D and observe that triangles ABD and 
ACD are congruent. Hence, angles BAD and CAD are also congruent. 


Since each of these angles is equal to half of our original angle, the line connecting A and D 
has bisected our original angle. 
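The bisection construction can be checked numerically. The sketch below is only an illustration under stated assumptions: $A$ is placed at the origin, the two rays are given by their angles in radians, the angle between them is less than 180 degrees, and the function name `bisector_direction` is hypothetical. It uses the fact that $D = B + C$ lies at distance 1 from both $B$ and $C$, so it is the second intersection point of the unit circles centered at $B$ and $C$.

```python
import math

def bisector_direction(theta1, theta2):
    """Direction (in radians) of the ray from A through D, where the rays of the
    angle leave A = (0, 0) in the directions theta1 and theta2."""
    # B and C are the points where the unit circle around A meets the two rays.
    bx, by = math.cos(theta1), math.sin(theta1)
    cx, cy = math.cos(theta2), math.sin(theta2)
    # D = B + C satisfies |D - B| = |C| = 1 and |D - C| = |B| = 1.
    dx, dy = bx + cx, by + cy
    return math.atan2(dy, dx)

# Bisecting the 60 degree angle between the directions 0 and pi/3 gives 30 degrees.
print(math.degrees(bisector_direction(0.0, math.pi / 3)))
```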


Having shown that we can split a line segment into $n$ equal parts, for any $n \in \mathbb{N}$, and then
having shown that angles can be bisected, it is natural to try to trisect angles. Our work in
Section 16.1 has enabled us to produce lists of numbers which are constructible as well as lists
of numbers which are not constructible. For example, Theorem 16.3 tells us that every number
on the following list is constructible:


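\[ \sqrt{2}, \sqrt{3}, \sqrt{5}, \sqrt{6}, \sqrt{7}, \sqrt[4]{2}, \sqrt[4]{3}, \sqrt[4]{5}, \sqrt[4]{6}, \sqrt[4]{7}, \sqrt[8]{2}, \sqrt[8]{3}, \sqrt[8]{5}, \sqrt[8]{6}, \sqrt[8]{7}. \]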


On the other hand, every number on the list

\[ \sqrt[3]{2}, \sqrt[3]{3}, \sqrt[3]{5}, \sqrt[3]{6}, \sqrt[3]{7}, \sqrt[5]{2}, \sqrt[5]{3}, \sqrt[5]{5}, \sqrt[5]{6}, \sqrt[5]{7}, \sqrt[6]{2}, \sqrt[6]{3}, \sqrt[6]{5}, \sqrt[6]{6}, \sqrt[6]{7} \]

satisfies an irreducible polynomial in $\mathbb{Q}[x]$ of degree 3, 5, or 6. Thus, Theorem 16.6 tells us
that no number on this list can be constructed with a ruler and compass. In fact, it will be
Theorem 16.6 that is the key to the proof that angles cannot be trisected.


Theorem 16.7. There are angles that cannot be trisected with ruler and compass. In 
particular, 60° angles cannot be trisected. 


Proof. We begin by drawing the circle with center $P_0 = (0, 0)$ with point $P_1 = (1, 0)$ on its
circumference and then drawing the circle with center $P_1$ that has $P_0$ on its circumference. If
we let $A$ be the point in the first quadrant where these circles intersect, then we can see that
triangle $A P_0 P_1$ is an equilateral triangle. Hence, $\angle A P_0 P_1 = 60^\circ$.


[Figure: the equilateral triangle $A P_0 P_1$, with $P_0 = (0, 0)$ and $P_1 = (1, 0)$.]


Next, by way of contradiction, let us suppose there was a sequence of allowable moves that
could trisect angles. Then, in particular, we could trisect $\angle A P_0 P_1$ to construct a $20^\circ$ angle with
vertex at $P_0$ and one ray starting at $P_0$ and continuing through $P_1$. The other ray would begin
at $P_0$ and would intersect the circle centered at $P_0$ with radius 1 at some constructible point $B$
in the first quadrant. Therefore, $\angle B P_0 P_1 = 20^\circ$. By the definition of the cosine function, the
x-coordinate of $B$ is $\cos(20^\circ)$.


[Figure: the point $B$ on the unit circle centered at $(0, 0)$, with the points $(\cos 20^\circ, 0)$ and $(1, 0)$ marked on the x-axis.]


As a result, if a sequence of allowable moves enabled us to trisect angles, then the number 
cos(20°) would be constructible. Thanks to Theorem 16.6, this observation will soon lead 
to a contradiction. 




There are two ways to compute $(\cos\theta + i \sin\theta)^3$. If we use DeMoivre's Theorem, we obtain

\[ (\cos\theta + i \sin\theta)^3 = \cos(3\theta) + i \sin(3\theta). \]

On the other hand, if we expand the expression using the distributive law and use the identity
$\sin^2\theta + \cos^2\theta = 1$, we obtain

\[ (\cos\theta + i \sin\theta)^3 = (\cos^3\theta - 3\cos\theta \sin^2\theta) + i(3\cos^2\theta \sin\theta - \sin^3\theta) \]
\[ = (\cos^3\theta - 3\cos\theta (1 - \cos^2\theta)) + i(3\cos^2\theta \sin\theta - \sin^3\theta) \]
\[ = (4\cos^3\theta - 3\cos\theta) + i(3\cos^2\theta \sin\theta - \sin^3\theta). \]

Comparing the real parts of the preceding equations, we obtain the triple-angle formula

\[ \cos(3\theta) = 4\cos^3\theta - 3\cos\theta. \]


Since $\cos(60^\circ) = \frac{1}{2}$, if we let $\theta = 20^\circ$ and also let $\alpha = \cos(20^\circ)$, we can see that $\alpha$ satisfies

\[ \frac{1}{2} = 4\alpha^3 - 3\alpha. \]

After multiplying this by 2 and subtracting 1, we can see that $\alpha$ is a root of
$8x^3 - 6x - 1 \in \mathbb{Q}[x]$.
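Both the triple-angle formula and the claim about $\cos(20^\circ)$ can be sanity-checked in floating point. The following is a minimal numerical sketch, not a proof; the names `triple_angle_gap` and `residual` are hypothetical.

```python
import math

def triple_angle_gap(theta):
    """Difference between cos(3*theta) and 4*cos(theta)^3 - 3*cos(theta)."""
    c = math.cos(theta)
    return math.cos(3 * theta) - (4 * c ** 3 - 3 * c)

# The identity holds (up to rounding) at an arbitrary angle.
print(triple_angle_gap(0.7))

# alpha = cos(20 degrees) is, numerically, a root of 8x^3 - 6x - 1.
alpha = math.cos(math.radians(20))
residual = 8 * alpha ** 3 - 6 * alpha - 1
print(residual)
```

Both printed values should be zero up to floating-point rounding error.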


Since $8x^3 - 6x - 1$ has degree 3, it will be irreducible in $\mathbb{Q}[x]$ if and only if it has no rational
roots. Before applying the Rational Root Test, we introduce a shortcut. If we let $y = 2x$, then

\[ 8x^3 - 6x - 1 = (2x)^3 - 3(2x) - 1 = y^3 - 3y - 1. \]

It is very easy to use the Rational Root Test to see that $y^3 - 3y - 1$ has no rational roots, as the
only candidates for rational roots are $\pm 1$ and they produce values of $-3$ and $1$. Since $y = 2x$,
any root of $y^3 - 3y - 1$ is two times a root of $8x^3 - 6x - 1$. Therefore, $8x^3 - 6x - 1$ also has no
rational roots, hence $8x^3 - 6x - 1$ is irreducible in $\mathbb{Q}[x]$.
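The Rational Root Test used above is easy to mechanize. Here is a small sketch, assuming integer coefficients listed from the highest degree down with nonzero leading and constant terms; the helper names `divisors` and `rational_roots` are hypothetical.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """All rational roots of the polynomial with integer coefficients `coeffs`
    (highest degree first, nonzero leading and constant coefficients)."""
    lead, const = coeffs[0], coeffs[-1]
    # Any rational root p/q in lowest terms has p | const and q | lead.
    candidates = {Fraction(sign * p, q)
                  for p in divisors(const)
                  for q in divisors(lead)
                  for sign in (1, -1)}

    def value(x):
        acc = Fraction(0)
        for c in coeffs:          # Horner's rule, exact arithmetic
            acc = acc * x + c
        return acc

    return sorted(x for x in candidates if value(x) == 0)

print(rational_roots([8, 0, -6, -1]))   # 8x^3 - 6x - 1: no rational roots
print(rational_roots([1, 0, -3, -1]))   # y^3 - 3y - 1: no rational roots
print(rational_roots([2, -3, 1]))       # 2x^2 - 3x + 1 = (2x - 1)(x - 1)
```

For a cubic, an empty list of rational roots means the polynomial is irreducible over $\mathbb{Q}$, which is the criterion used in the proof.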


We have just seen that $\cos(20^\circ)$ is a root of the irreducible cubic $8x^3 - 6x - 1 \in \mathbb{Q}[x]$.
Therefore,

\[ [\mathbb{Q}(\cos(20^\circ)) : \mathbb{Q}] = 3. \]

However, Theorem 16.6 asserts that if $\cos(20^\circ)$ is constructible, then

\[ [\mathbb{Q}(\cos(20^\circ)) : \mathbb{Q}] = 2^l, \]

for some integer $l \ge 0$. This is a contradiction, as $3 \neq 2^l$ for any integer $l$. As a result, $20^\circ$
angles cannot be constructed, so angles cannot be trisected with a ruler and compass.




We conclude this chapter with a brief discussion of two other famous constructibility
problems: the "Doubling of the Cube" and the "Squaring of the Circle." A cube with volume 1
has sides of length 1. The Doubling of the Cube problem asks whether it is possible to
construct a line segment whose length would produce a cube whose volume was 2. Certainly,
a cube of volume 2 would have sides of length $\sqrt[3]{2}$. However, $\sqrt[3]{2}$ is a root of the polynomial
$x^3 - 2$, which is irreducible over $\mathbb{Q}$. Therefore,

\[ [\mathbb{Q}(\sqrt[3]{2}) : \mathbb{Q}] = 3, \]

so Theorem 16.6 tells us that $\sqrt[3]{2}$ is not constructible. Therefore, the cube cannot be doubled
with a ruler and compass.


The area of a circle of radius 1 is $\pi$. The Squaring of the Circle problem asks whether it is
possible to construct a line segment whose length would produce a square whose area is $\pi$.
Once again, Theorem 16.6 plays a fundamental role. However, parts of this problem require
ideas not covered in this course. If a square has area equal to $\pi$, then the length of its sides
must be $\sqrt{\pi}$. Therefore, if we could square the circle, then Theorem 16.6 would imply that
$[\mathbb{Q}(\sqrt{\pi}) : \mathbb{Q}]$ was equal to $2^n$, for some integer $n \ge 0$.


It turns out that $\pi$ is transcendental. This is another way of saying that $\pi$ is not algebraic,
so it is not a root of any nonconstant polynomial with rational coefficients. However, the
proof that $\pi$ is transcendental is quite long and would take us very far afield. But once you
know that $\pi$ is transcendental, it is easy to show that $\sqrt{\pi}$ is also transcendental. Thus, at this
point, Theorem 16.6 would tell you that we cannot square the circle with ruler and
compass.


Exercises for Sections 16.1 and 16.2 


1. If $\alpha, \beta \in \mathbb{R}$ such that $\alpha$ is constructible and $\beta$ is not constructible, prove that $\alpha + \beta$ is not
constructible.


2. If $\alpha, \beta \in \mathbb{R}$ such that $\alpha$ is a nonzero constructible number and $\beta$ is not constructible,
prove that $\alpha \cdot \beta$ is not constructible.


3. If $n^\circ$ and $m^\circ$ angles can be constructed, explain why $(n + m)^\circ$, $(n - m)^\circ$, and $(\frac{n}{2})^\circ$ angles
can also be constructed.


4. Show that $36^\circ$ angles can be constructed. You may want to look at exercise 18 following
Section 11.1.


5. Show that $3^\circ$ angles can be constructed.


6. If $n \in \mathbb{N}$, show that $n^\circ$ angles can be constructed if and only if $n$ is a multiple of 3.


In exercises 7-18, determine whether the given number is constructible and explain your
answer. Exercises 1 and 2 may be useful in solving these problems.


ds 
8. 
9. 
10. 


11. 


12. 


13. 


14. 


15. 


16. 


17. 


18. 
173
23%


27-197


19. If $D$ is the field of constructible numbers, show that $D$ is not a finite extension of $\mathbb{Q}$.


20. If $D$ is the field of constructible numbers, show that $[D(\alpha) : D] \neq 2$, for every $\alpha \in \mathbb{R}$.


Insolvability of the Quintic 


In our final chapter, we pull together much of the mathematical machinery developed in the
first 16 chapters to give a proof of Galois's classic result on the insolvability of the quintic. In
Chapter 10, we saw that all polynomials of degree 1, 2, 3, or 4 can be solved by radicals. This
means that we can find the roots of these polynomials by adding, subtracting, multiplying,
dividing, and taking $n$th roots, for various $n \in \mathbb{N}$, of combinations of the coefficients.


Back in Chapter 1, we wondered how we go about proving that something cannot be done. 
More precisely, how does one prove that polynomials of degree greater than or equal to 5 
cannot be solved by radicals? Applying the tools we have developed, we do this in three steps: 


I. Formally define, in terms of field extensions, what it means for a polynomial to be 
solvable by radicals. 


II. Prove that the Galois group of the splitting field of a polynomial that is solvable by 
radicals is a solvable group. 


III. Exhibit polynomials such that the Galois groups of their splitting fields are not solvable
groups. 


17.1 Radical Extensions and Their Galois Groups 


The first step in proving the insolvability of the quintic is to formally define, in terms of field
extensions, exactly what it means for a polynomial to be solvable by radicals. To see what the
definition should be, we will look at several examples and then examine what they have in
common. Throughout this chapter, for $m \in \mathbb{N}$, we will let $\omega_m = \mathrm{cis}\left(\frac{2\pi}{m}\right)$.


Examples


1. Let $f(x) = x^2 - 11 \in \mathbb{Q}[x]$. The roots of $f(x)$ are $\pm\sqrt{11}$ and they both belong to the
field $L_1 = \mathbb{Q}(\sqrt{11})$.


2. Let $g(x) = x^3 - 5 \in \mathbb{Q}[x]$. The roots of $g(x)$ are
$\sqrt[3]{5},\ \omega_3 \sqrt[3]{5},\ \omega_3^2 \sqrt[3]{5}$
and they all belong to the field $L_2 = \mathbb{Q}(\omega_3, \sqrt[3]{5})$.


3. Let $h(x) = x^6 - 15x^4 - 8x^3 + 75x^2 - 120x - 109 \in \mathbb{Q}[x]$. The roots of $h(x)$ are
$\pm\sqrt{5} + \omega_3^i \sqrt[3]{4}$,
for $0 \le i \le 2$, and they all belong to the field $L_3 = \mathbb{Q}(\omega_6, \sqrt{5}, \sqrt[3]{4})$.


4. Let $\beta = \sqrt[4]{6 - 20\sqrt[5]{7}}$; $\beta$ is the root of some polynomial in $\mathbb{Q}[x]$ and $\beta$ belongs to the
field $L_4 = \mathbb{Q}\left(\omega_{20}, \sqrt[5]{7}, \sqrt[4]{6 - 20\sqrt[5]{7}}\right)$.


5. Let y= 14/28 + V 307 + 4/13; y is the root of some polynomial in Q[x] and y 
belongs to the field 


Ls = (020.4 8 4/7, 113, (3744 : 3). 
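Example 3 can be checked numerically. The sketch below is a floating-point sanity check rather than a proof, and it reads the roots as $\pm\sqrt{5} + \omega_3^i \sqrt[3]{4}$ with $\omega_3 = \mathrm{cis}(2\pi/3)$; the variable names are hypothetical.

```python
import cmath
import math

def h(x):
    """The polynomial from Example 3."""
    return x**6 - 15 * x**4 - 8 * x**3 + 75 * x**2 - 120 * x - 109

omega3 = cmath.exp(2j * math.pi / 3)      # cis(2*pi/3), a primitive cube root of 1
cbrt4 = 4 ** (1 / 3)                      # the real cube root of 4

# The six numbers  +-sqrt(5) + omega3^i * cbrt(4),  i = 0, 1, 2.
roots = [sign * math.sqrt(5) + omega3**i * cbrt4
         for sign in (1, -1) for i in range(3)]

residuals = [abs(h(r)) for r in roots]
print(max(residuals))   # tiny compared with the size of the individual terms
```

Each of these numbers $r$ satisfies $(r \mp \sqrt{5})^3 = 4$, and eliminating $\sqrt{5}$ from that relation by squaring produces exactly $h(x)$.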


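Observe that another way to look at the fields that arose in our five examples is
1. $L_1 = \mathbb{Q}(a_1)$, where $a_1^2 \in \mathbb{Q}$;
2. $L_2 = \mathbb{Q}(\omega_3, b_1)$, where $b_1^3 \in \mathbb{Q}(\omega_3)$;
3. $L_3 = \mathbb{Q}(\omega_6, c_1, c_2)$, where $c_1^2 \in \mathbb{Q}(\omega_6)$ and $c_2^3 \in \mathbb{Q}(\omega_6, c_1)$;
4. $L_4 = \mathbb{Q}(\omega_{20}, d_1, d_2)$, where $d_1^5 \in \mathbb{Q}(\omega_{20})$ and $d_2^4 \in \mathbb{Q}(\omega_{20}, d_1)$;
5.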


Ls = Q(a52, €1, €2, €3, €4), Where €17 € O(w52), €27 € Q(w252, €1), 
3’ € Q(ar52, €1, €2), and e4° € Q(ar52, €1, 2, €3). 


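In fact, if we want $L_1$ to look more like $L_2$, $L_3$, $L_4$, and $L_5$, we can express $L_1$ as
$L_1 = \mathbb{Q}(\omega_2, a_1)$, where $a_1^2 \in \mathbb{Q}(\omega_2)$,
since $\omega_2 = -1 \in \mathbb{Q}$.
In all five examples, the roots of our polynomials belong to fields of the form

\[ \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1}, \alpha_n), \]

where $\alpha_1^{m_1} \in \mathbb{Q}(\omega_m)$, $\alpha_2^{m_2} \in \mathbb{Q}(\omega_m, \alpha_1)$, $\alpha_3^{m_3} \in \mathbb{Q}(\omega_m, \alpha_1, \alpha_2)$, ..., $\alpha_n^{m_n} \in \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1})$,
and $m = m_1 \cdot m_2 \cdots m_n$.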


It certainly appears that fields of the type described above will play an important role in the 
proof of the insolvability of the quintic. This motivates 




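Definition 17.1. A field $L$ is called a radical extension if there exist $\alpha_i \in \mathbb{C}$ and $m_i \in \mathbb{N}$ such
that

(a) $L = \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1}, \alpha_n)$,
(b) $\omega_m = \mathrm{cis}\left(\frac{2\pi}{m}\right)$, where $m = m_1 \cdot m_2 \cdots m_n$,
(c) $\alpha_1^{m_1} \in \mathbb{Q}(\omega_m)$, $\alpha_2^{m_2} \in \mathbb{Q}(\omega_m, \alpha_1)$, $\alpha_3^{m_3} \in \mathbb{Q}(\omega_m, \alpha_1, \alpha_2)$, ..., $\alpha_n^{m_n} \in \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1})$.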


Expressions that arose in our examples, such as the number $\beta = \sqrt[4]{6 - 20\sqrt[5]{7}}$ of example (4)
and the number $\gamma$ of example (5), are exactly the types of elements of $\mathbb{C}$ we would expect to
see as roots of polynomials in which the roots are obtained by adding, subtracting, multiplying,
dividing, and taking $n$th roots, for $n \in \mathbb{N}$, of combinations of the coefficients. Since
expressions like these belong to radical extensions, it seems reasonable to say that if $\alpha \in \mathbb{C}$ is
a root of some $f(x) \in \mathbb{Q}[x]$ that is solvable by radicals, then $\alpha$ belongs to some radical
extension. Therefore, it appears that the proper definition of what it means for $f(x) \in \mathbb{Q}[x]$ to
be solvable by radicals is that all of its roots belong to some radical extension. We record this as


Definition 17.2. A polynomial $f(x) \in \mathbb{Q}[x]$ is solvable by radicals if there exists a radical
extension $L$ such that $L \supseteq K \supseteq \mathbb{Q}$, where $K$ is the splitting field of $f(x)$ over $\mathbb{Q}$.


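Having succeeded in giving a formal definition, in terms of field extensions, of what it means
for a polynomial to be solvable by radicals, we now need to examine properties of radical
extensions. If $L$ is a radical extension, then $L = \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1}, \alpha_n)$ and repeated
applications of Theorem 14.6 tell us that

\[ [L : \mathbb{Q}] = [\mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1}, \alpha_n) : \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1})] \cdots [\mathbb{Q}(\omega_m, \alpha_1, \alpha_2) : \mathbb{Q}(\omega_m, \alpha_1)] \cdot [\mathbb{Q}(\omega_m, \alpha_1) : \mathbb{Q}(\omega_m)] \cdot [\mathbb{Q}(\omega_m) : \mathbb{Q}]. \]

Since $\alpha_i^{m_i} \in \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{i-1})$, for $2 \le i \le n$, we know that

\[ [\mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{i-1}, \alpha_i) : \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{i-1})] \le m_i. \]

In addition,

\[ [\mathbb{Q}(\omega_m, \alpha_1) : \mathbb{Q}(\omega_m)] \le m_1 \quad \text{and} \quad [\mathbb{Q}(\omega_m) : \mathbb{Q}] \le m. \]

Combining these facts, it follows that

\[ [L : \mathbb{Q}] \le m_n \cdot m_{n-1} \cdots m_2 \cdot m_1 \cdot m = m^2. \]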




As a result, all radical extensions are finite extensions of $\mathbb{Q}$. Thus, every element of a radical
extension is algebraic over $\mathbb{Q}$. This is the reason why, in examples (4) and (5), we could say that
the numbers $\beta$ and $\gamma$ were roots of polynomials in $\mathbb{Q}[x]$, even though we did not explicitly
mention polynomials that had them as roots.


One thing that might strike you as odd about radical extensions is the inclusion of
$\omega_m = \mathrm{cis}\left(\frac{2\pi}{m}\right)$. There are two reasons for including $\omega_m$. The first, which is certainly not
apparent at this time, is that it will simplify the computation of Galois groups of radical
extensions.


The second reason concerns expressions of the form $\sqrt[4]{6 - 20\sqrt[5]{7}}$. This expression represents
an element of $\mathbb{C}$ that, when raised to the fourth power, is equal to $6 - 20\sqrt[5]{7}$. Since there are
no real numbers whose fourth power is $6 - 20\sqrt[5]{7}$, there is some ambiguity as to which
element of $\mathbb{C}$ is meant by the expression $\sqrt[4]{6 - 20\sqrt[5]{7}}$.


DeMoivre's Theorem asserts that there are four different elements of $\mathbb{C}$ that, when raised to the
fourth power, give us $6 - 20\sqrt[5]{7}$. Furthermore, if $\beta$ is one of them, then the other three are
$\omega_4 \beta$, $\omega_4^2 \beta$, and $\omega_4^3 \beta$. However, since our radical extension contains $\omega_4$, once it contains one
element of $\mathbb{C}$ whose fourth power is $6 - 20\sqrt[5]{7}$, it automatically contains all four. Therefore,
any ambiguity about which fourth root of $6 - 20\sqrt[5]{7}$ is contained in a radical extension is no
longer a concern.
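The four fourth roots discussed above can be computed numerically. This is a minimal floating-point sketch with hypothetical variable names; it picks one fourth root $\beta$ and obtains the other three by multiplying by powers of $\omega_4 = \mathrm{cis}(2\pi/4) = i$.

```python
import cmath

z = 6 - 20 * 7 ** (1 / 5)            # 6 - 20 * (fifth root of 7), a negative real number
beta = cmath.exp(cmath.log(z) / 4)   # one complex number whose fourth power is z
omega4 = 1j                          # cis(2*pi/4) = i

# The four fourth roots of z: beta, omega4*beta, omega4^2*beta, omega4^3*beta.
fourth_roots = [beta * omega4 ** k for k in range(4)]

for r in fourth_roots:
    print(r, abs(r ** 4 - z))        # each residual is zero up to rounding
```

None of the four values is real, which matches the observation that no real number has a negative fourth power.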


As a result, by letting $\omega_m = \mathrm{cis}\left(\frac{2\pi}{m}\right)$, where $m = m_1 \cdot m_2 \cdots m_n$, our radical extension contains
all $m_i$ of the $m_i$th roots of 1, for $1 \le i \le n$. Therefore, whenever our radical extension contains
one $m_i$th root of an element of $\mathbb{C}$, it will automatically contain all $m_i$ of them.


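It is now time to examine the Galois groups of radical extensions. To simplify the notation
used with radical extensions, if

\[ L = \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1}, \alpha_n) \]

is a radical extension, let

\[ L_i = \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{i-1}, \alpha_i), \]

for $1 \le i \le n$, and let $L_0 = \mathbb{Q}(\omega_m)$.
We now have the chain of fields

\[ L = L_n \supseteq L_{n-1} \supseteq \cdots \supseteq L_1 \supseteq L_0 \supseteq \mathbb{Q}. \]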


We would like to use this chain of fields to produce a chain of groups. To do this, we need


Lemma 17.3. If $F \subseteq K \subseteq L$ are fields, then $\mathrm{Gal}(L/K)$ is a subgroup of $\mathrm{Gal}(L/F)$.


Proof. Every element of $\mathrm{Gal}(L/K)$ is an automorphism of the field $L$ that is the identity map
on the field $K$. Since $K$ contains $F$, every element of $\mathrm{Gal}(L/K)$ is certainly the identity map
on the smaller field $F$, so $\mathrm{Gal}(L/K)$ is a subset of $\mathrm{Gal}(L/F)$. Since $\mathrm{Gal}(L/K)$ is already a
group, we know that if $g, h \in \mathrm{Gal}(L/K)$, then $gh^{-1} \in \mathrm{Gal}(L/K)$. Proposition 8.6(a) now tells
us that $\mathrm{Gal}(L/K)$ is a subgroup of $\mathrm{Gal}(L/F)$.


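Example


We revisit an example from Chapter 15. Consider the field $\mathbb{Q}(\sqrt{3}, i)$; then $\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$ is a group of
order 4. Since $\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$ consists of automorphisms that are the identity on $\mathbb{Q}$, its
elements can be completely described in terms of their behavior on $\sqrt{3}$ and $i$ as follows:

\[ g_1(\sqrt{3}) = \sqrt{3},\ g_1(i) = i, \qquad g_2(\sqrt{3}) = -\sqrt{3},\ g_2(i) = i, \]
\[ g_3(\sqrt{3}) = \sqrt{3},\ g_3(i) = -i, \qquad g_4(\sqrt{3}) = -\sqrt{3},\ g_4(i) = -i. \]

Since $\mathbb{Q}(i)$, $\mathbb{Q}(\sqrt{3})$, and $\mathbb{Q}(i\sqrt{3})$ are all fields containing $\mathbb{Q}$ and contained in $\mathbb{Q}(\sqrt{3}, i)$,
Lemma 17.3 tells us that

\[ \mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}(i)),\quad \mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}(\sqrt{3})),\quad \text{and} \quad \mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}(i\sqrt{3})) \]

are all subgroups of $\mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$.


It is now not hard to see that

\[ \mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}(i)) = \{g_1, g_2\} \subseteq \mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}), \]
\[ \mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}(\sqrt{3})) = \{g_1, g_3\} \subseteq \mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}), \]
\[ \mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}(i\sqrt{3})) = \{g_1, g_4\} \subseteq \mathrm{Gal}(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}). \]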
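The three subgroups in this example can be recovered mechanically. The sketch below encodes each automorphism by a pair of signs, which is an encoding chosen here for illustration (as are the names `G`, `fixes`, and `subgroup_fixing`): the pair $(s, t)$ means $\sqrt{3} \mapsto s\sqrt{3}$ and $i \mapsto t\,i$, so that $i\sqrt{3} \mapsto st \cdot i\sqrt{3}$.

```python
# Each automorphism is stored as (s, t): sqrt(3) -> s*sqrt(3) and i -> t*i.
G = {"g1": (1, 1), "g2": (-1, 1), "g3": (1, -1), "g4": (-1, -1)}

def fixes(signs, generator):
    """True if the automorphism with the given signs fixes the generator."""
    s, t = signs
    factor = {"sqrt3": s, "i": t, "i*sqrt3": s * t}[generator]
    return factor == 1

def subgroup_fixing(generator):
    """Names of the automorphisms that are the identity on Q(generator)."""
    return {name for name, signs in G.items() if fixes(signs, generator)}

print(sorted(subgroup_fixing("i")))         # Gal(Q(sqrt3, i)/Q(i))
print(sorted(subgroup_fixing("sqrt3")))     # Gal(Q(sqrt3, i)/Q(sqrt3))
print(sorted(subgroup_fixing("i*sqrt3")))   # Gal(Q(sqrt3, i)/Q(i*sqrt3))
```

An automorphism that fixes the generator of a simple extension of $\mathbb{Q}$ fixes the whole intermediate field, which is why checking the generator alone is enough here.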


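Using the notation for the chain of fields that we introduced before Lemma 17.3, if we let
$G = \mathrm{Gal}(L/\mathbb{Q})$, then Lemma 17.3 tells us that $\mathrm{Gal}(L/L_i)$ is a subgroup of $G$, for $0 \le i \le n$.
In fact, we obtain the chain of subgroups

\[ G = \mathrm{Gal}(L/\mathbb{Q}) \supseteq \mathrm{Gal}(L/L_0) \supseteq \mathrm{Gal}(L/L_1) \supseteq \cdots \supseteq \mathrm{Gal}(L/L_{n-1}) \supseteq \mathrm{Gal}(L/L_n) = \{e\}. \]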




Therefore, if we wish to prove that Gal(L/Q) is a solvable group, it suffices to show that 

this chain of subgroups satisfies the condition in Proposition 8.33. Parts of the proof of 
Proposition 17.4 should look familiar as many of the ideas in the proof have already appeared 
in the proofs of Corollaries 15.19, 15.20, and 15.23. 


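Proposition 17.4. If $L$ is a radical extension, then $\mathrm{Gal}(L/\mathbb{Q})$ is a solvable group.


Proof. Since $L$ is a radical extension, there exist $\alpha_i \in \mathbb{C}$ and $m_i \in \mathbb{N}$ such that

\[ L = \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1}, \alpha_n), \]

where $\alpha_1^{m_1} \in \mathbb{Q}(\omega_m)$, $\alpha_2^{m_2} \in \mathbb{Q}(\omega_m, \alpha_1)$, $\alpha_3^{m_3} \in \mathbb{Q}(\omega_m, \alpha_1, \alpha_2)$, ..., $\alpha_n^{m_n} \in \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1})$,
and $m = m_1 \cdot m_2 \cdots m_n$.
If we let $L_i = \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{i-1}, \alpha_i)$, for $1 \le i \le n$, and let $L_0 = \mathbb{Q}(\omega_m)$, then we have the
chain of fields

\[ L = L_n \supseteq L_{n-1} \supseteq \cdots \supseteq L_1 \supseteq L_0 \supseteq \mathbb{Q}. \]

Next, if we let $G = \mathrm{Gal}(L/\mathbb{Q})$, then Lemma 17.3 turns the chain of fields into the following
chain of subgroups:

\[ G = \mathrm{Gal}(L/\mathbb{Q}) \supseteq \mathrm{Gal}(L/L_0) \supseteq \mathrm{Gal}(L/L_1) \supseteq \cdots \supseteq \mathrm{Gal}(L/L_{n-1}) \supseteq \mathrm{Gal}(L/L_n) = \{e\}. \]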


According to Proposition 8.33, in order to show that $\mathrm{Gal}(L/\mathbb{Q})$ is solvable, it now suffices to
show that

(a) $ghg^{-1}h^{-1} \in \mathrm{Gal}(L/L_0)$, for all $g, h \in \mathrm{Gal}(L/\mathbb{Q})$, and
(b) $ghg^{-1}h^{-1} \in \mathrm{Gal}(L/L_{i+1})$, for all $g, h \in \mathrm{Gal}(L/L_i)$, where $0 \le i \le n - 1$.


For part (a), let $g, h \in \mathrm{Gal}(L/\mathbb{Q})$; since $L_0 = \mathbb{Q}(\omega_m)$, in order to show that $ghg^{-1}h^{-1} \in
\mathrm{Gal}(L/L_0)$, we need to show that $ghg^{-1}h^{-1}$ is the identity on $\omega_m$. The polynomial $x^m - 1$ has
coefficients in $\mathbb{Q}$, therefore Corollary 5.13 tells us $g^{-1}$ and $h^{-1}$ must both send roots of $x^m - 1$
to roots of $x^m - 1$. Since every root of $x^m - 1$ is a power of $\omega_m$, if we simplify the notation by
letting $\omega$ take the place of $\omega_m$, there exist $i, j \in \mathbb{N}$ such that

\[ g^{-1}(\omega) = \omega^i \quad \text{and} \quad h^{-1}(\omega) = \omega^j. \]

Therefore, $g(\omega^i) = \omega$ and $h(\omega^j) = \omega$, and since $g, h, g^{-1}, h^{-1}$ are all automorphisms, we have

\[ (ghg^{-1}h^{-1})(\omega) = (ghg^{-1})(h^{-1}(\omega)) = (ghg^{-1})(\omega^j) = (gh)(g^{-1}(\omega^j)) = (gh)((g^{-1}(\omega))^j) \]
\[ = (gh)((\omega^i)^j) = (gh)(\omega^{ij}) = g(h(\omega^{ij})) = g((h(\omega^j))^i) = g(\omega^i) = \omega. \]

Therefore, $ghg^{-1}h^{-1}(\omega_m) = \omega_m$ and it is indeed the case that if $g, h \in \mathrm{Gal}(L/\mathbb{Q})$, then
$ghg^{-1}h^{-1} \in \mathrm{Gal}(L/L_0)$.
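The computation in part (a) can be mirrored with modular arithmetic. In the sketch below, an automorphism is recorded only by the exponent $a$ with $\omega \mapsto \omega^a$, where $\gcd(a, m) = 1$; this encoding and the function names are assumptions made for illustration, and `pow(a, -1, m)` requires Python 3.8 or later. Not every such exponent has to come from an element of $\mathrm{Gal}(L/\mathbb{Q})$, but every element acts on $\omega$ in this way.

```python
from math import gcd

def units(m):
    """Exponents a in 1..m-1 with gcd(a, m) = 1 (and a = 1 always included)."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def commutator_exponent(a, b, m):
    """Exponent c with (g h g^-1 h^-1)(w) = w^c, where g(w) = w^a and h(w) = w^b.

    Composing two such maps multiplies their exponents modulo m, and the
    inverse map corresponds to the inverse exponent modulo m."""
    a_inv = pow(a, -1, m)
    b_inv = pow(b, -1, m)
    return (a * b * a_inv * b_inv) % m

m = 20
print(units(m))
print({commutator_exponent(a, b, m) for a in units(m) for b in units(m)})
```

The set printed last contains only the exponent 1, reflecting that every commutator fixes $\omega_m$.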


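For part (b), if $0 \le i \le n - 1$ and $g, h \in \mathrm{Gal}(L/L_i)$, we need to show that $ghg^{-1}h^{-1} \in
\mathrm{Gal}(L/L_{i+1})$. Recall that $L_{i+1} = L_i(\alpha_{i+1})$, where

\[ \alpha_{i+1}^{m_{i+1}} = a, \]

for some $a \in L_i$. Since $\alpha_{i+1}$ is a root of $x^{m_{i+1}} - a \in L_i[x]$, both $g^{-1}$ and $h^{-1}$ must send $\alpha_{i+1}$ to
another root of $x^{m_{i+1}} - a$. However, every root of $x^{m_{i+1}} - a$ is equal to $\alpha_{i+1}$ times a root of
$x^{m_{i+1}} - 1$. On the other hand, every root of $x^{m_{i+1}} - 1$ is a power of $\omega_m$. If we once again
simplify the notation by letting $\omega = \omega_m$, there exist $j, k \in \mathbb{N}$ such that

\[ g^{-1}(\alpha_{i+1}) = \omega^j \alpha_{i+1} \quad \text{and} \quad h^{-1}(\alpha_{i+1}) = \omega^k \alpha_{i+1}. \]

Since $g$ and $h$ are the identity on $L_i$, they are the identity on all powers of $\omega$. Combining this
with the previous equation and the fact that $g, h, g^{-1}, h^{-1}$ are all automorphisms, we have

\[ (ghg^{-1}h^{-1})(\alpha_{i+1}) = (ghg^{-1})(h^{-1}(\alpha_{i+1})) = (ghg^{-1})(\omega^k \alpha_{i+1}) \]
\[ = (gh)(g^{-1}(\omega^k \alpha_{i+1})) = (gh)(g^{-1}(\omega^k) g^{-1}(\alpha_{i+1})) = (gh)(\omega^k g^{-1}(\alpha_{i+1})) \]
\[ = (gh)(\omega^k \omega^j \alpha_{i+1}) = (gh)(\omega^j \omega^k \alpha_{i+1}) = g(h(\omega^j \omega^k \alpha_{i+1})) \]
\[ = g(h(\omega^j) h(\omega^k \alpha_{i+1})) = g(\omega^j \alpha_{i+1}) = \alpha_{i+1}. \]

Since $ghg^{-1}h^{-1}$ has been shown to be the identity on $L_i$ and $\alpha_{i+1}$, it follows that it is the
identity on $L_{i+1} = L_i(\alpha_{i+1})$. Thus, $ghg^{-1}h^{-1}$ is an automorphism of $L$ which is the identity on
$L_{i+1}$, so $ghg^{-1}h^{-1}$ does indeed belong to $\mathrm{Gal}(L/L_{i+1})$, as required.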


Proving that Galois groups of radical extensions are solvable is a huge step towards the proof 
of the insolvability of the quintic. If f(x) € Q[x] is solvable by radicals, we would like to show 
that Gal(K/Q) is solvable, where K is the splitting field of f(x). We know that there exists a 
radical extension $L$ such that $L \supseteq K \supseteq \mathbb{Q}$, and Proposition 17.4 tells us that $\mathrm{Gal}(L/\mathbb{Q})$ is
solvable. But this does not yet tell us that Gal(K/Q) is solvable. Although Corollary 8.40 is 
not stated in these exact terms, if you look back at the proof of this result, you will see that the 
proof tells us that subgroups as well as images under homomorphisms of solvable groups are 
solvable. However, Gal(K/Q) has not been shown to be either a subgroup or the image under 
a homomorphism of Gal(L/Q). 


On the other hand, if L could be replaced by a radical extension M of Q that not only 
contained K but was also a splitting field over Q, then we would eventually be able to show 
that Gal(K/Q) is the image under a homomorphism of the solvable group Gal(M/Q). The 
next two lemmas are rather long and technical, but they are exactly what we need to show that 
there exists a radical extension M with the desired properties. 




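Lemma 17.5. If $L_1 = \mathbb{Q}(\beta_1)$, $L_2 = \mathbb{Q}(\beta_2)$, ..., $L_t = \mathbb{Q}(\beta_t)$ are radical extensions, then there
exists $m \in \mathbb{N}$ such that $\mathbb{Q}(\omega_m, \beta_1, \ldots, \beta_t)$ is a radical extension.


Proof. Since $L_i$ is a radical extension, for $1 \le i \le t$, we have

\[ L_i = \mathbb{Q}(\beta_i) = \mathbb{Q}(\omega_{m_i}, \alpha_{i,1}, \alpha_{i,2}, \ldots, \alpha_{i,n_i}), \]

where $\alpha_{i,j} \in \mathbb{C}$, $\alpha_{i,1}^{m_{i,1}} \in \mathbb{Q}(\omega_{m_i})$, $\alpha_{i,j}^{m_{i,j}} \in \mathbb{Q}(\omega_{m_i}, \alpha_{i,1}, \ldots, \alpha_{i,j-1})$, for $2 \le j \le n_i$, and
$m_i = m_{i,1} \cdot m_{i,2} \cdots m_{i,n_i}$.


We now make two simple observations. The first is that if $F \subseteq K$ are fields and $\gamma \in \mathbb{C}$ has the
property that $\gamma^k \in F$, then it is clear that $\gamma^k \in K$. Next, if $n, a, b \in \mathbb{N}$ such that $n = a \cdot b$, then

\[ \omega_n^a = \left( \mathrm{cis}\left( \frac{2\pi}{n} \right) \right)^a = \mathrm{cis}\left( \frac{2\pi a}{n} \right) = \mathrm{cis}\left( \frac{2\pi}{b} \right) = \omega_b. \]

Thus, $\omega_b \in \mathbb{Q}(\omega_n)$ and $\mathbb{Q}(\omega_b) \subseteq \mathbb{Q}(\omega_n)$.


In light of these observations, consider the field

\[ L = \mathbb{Q}(\omega_m, \alpha_{1,1}, \ldots, \alpha_{1,n_1}, \alpha_{2,1}, \ldots, \alpha_{2,n_2}, \ldots, \alpha_{t,1}, \ldots, \alpha_{t,n_t}), \]

where $m = m_1 \cdot m_2 \cdots m_t$.


We now need to show that $L$ is a radical extension. Since $m$ is a multiple of each $m_i$, and hence of
each $m_{i,j}$, where $1 \le i \le t$ and $1 \le j \le n_i$, we know that $\mathbb{Q}(\omega_{m_i}) \subseteq \mathbb{Q}(\omega_m)$. As a result,
$\alpha_{1,1}^{m_{1,1}} \in \mathbb{Q}(\omega_m)$ and $\alpha_{i,j}^{m_{i,j}}$ belongs to the field generated over $\mathbb{Q}$ by all the elements to the
left of $\alpha_{i,j}$ in the definition of $L$. For example,

\[ \alpha_{3,2}^{m_{3,2}} \in \mathbb{Q}(\omega_{m_3}, \alpha_{3,1}) \subseteq \mathbb{Q}(\omega_m, \alpha_{1,1}, \ldots, \alpha_{1,n_1}, \alpha_{2,1}, \ldots, \alpha_{2,n_2}, \alpha_{3,1}). \]

Since $m$ is the product of all the $m_{i,j}$, we see that $L$ is indeed a radical extension.
Next, since $\beta_i \in \mathbb{Q}(\omega_{m_i}, \alpha_{i,1}, \alpha_{i,2}, \ldots, \alpha_{i,n_i}) \subseteq L$, for all $i$, it follows that

\[ \mathbb{Q}(\omega_m, \beta_1, \ldots, \beta_t) \subseteq L. \]

On the other hand, since each $\alpha_{i,j} \in \mathbb{Q}(\beta_i)$, we can also see that every generator of $L$ over $\mathbb{Q}$
belongs to $\mathbb{Q}(\omega_m, \beta_1, \ldots, \beta_t)$. Therefore,

\[ L \subseteq \mathbb{Q}(\omega_m, \beta_1, \ldots, \beta_t). \]

As a result, $L = \mathbb{Q}(\omega_m, \beta_1, \ldots, \beta_t)$, hence $\mathbb{Q}(\omega_m, \beta_1, \ldots, \beta_t)$ is a radical extension.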


We can now prove the extremely useful 




Lemma 17.6. If $f(x) \in \mathbb{Q}[x]$ is solvable by radicals, then there exists a radical extension $M$
such that $M \supseteq K \supseteq \mathbb{Q}$, where $M$ is the splitting field of some $g(x) \in \mathbb{Q}[x]$ over $\mathbb{Q}$ and $K$ is
the splitting field of $f(x)$ over $\mathbb{Q}$.


Proof. Since $f(x)$ is solvable by radicals, $K$ is contained in a radical extension
$L = \mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1}, \alpha_n)$ that satisfies the properties in Definition 17.1. In light of Theorem 15.12,
there exists $\beta \in \mathbb{C}$ such that $L = \mathbb{Q}(\beta)$. Next, let $h(x) \in \mathbb{Q}[x]$ be the minimum polynomial for
$\beta$ over $\mathbb{Q}$, and also let $t$ denote the degree of $h(x)$. If $\gamma \in \mathbb{C}$ is another root of $h(x)$, then
Lemma 17.6 asserts that the map

\[ \pi : \mathbb{Q}(\beta) \to \mathbb{Q}(\gamma) \]

defined as

\[ \pi(a_0 + a_1 \beta + \cdots + a_{t-1} \beta^{t-1}) = a_0 + a_1 \gamma + \cdots + a_{t-1} \gamma^{t-1}, \]

for all $a_i \in \mathbb{Q}$, is a bijective homomorphism of fields that is the identity on $\mathbb{Q}$. We need to show
that $\mathbb{Q}(\gamma)$ is also a radical extension.


To this end, since $\pi$ is a bijective homomorphism which is the identity on $\mathbb{Q}$, we have

\[ \mathbb{Q}(\gamma) = \mathbb{Q}(\pi(\beta)) = \pi(\mathbb{Q}(\beta)) = \pi(\mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_{n-1}, \alpha_n)) = \mathbb{Q}(\pi(\omega_m), \pi(\alpha_1), \ldots, \pi(\alpha_{n-1}), \pi(\alpha_n)). \]

The terms of the form $\omega_m^j$, for $0 \le j \le m - 1$, are the $m$ distinct roots of $x^m - 1$. Since $\pi$ is an
injective homomorphism, observe that the terms of the form $\pi(\omega_m)^j$ are also the $m$ distinct
roots of $x^m - 1$. As a result, $\pi(\omega_m)$ is a power of $\omega_m$, and, conversely, $\omega_m$ is also a power of
$\pi(\omega_m)$. Therefore, a field extension of $\mathbb{Q}$ contains $\omega_m$ if and only if it contains $\pi(\omega_m)$. In light
of this and our preceding work, we now have

\[ \mathbb{Q}(\gamma) = \mathbb{Q}(\omega_m, \pi(\alpha_1), \ldots, \pi(\alpha_{n-1}), \pi(\alpha_n)). \]

By repeatedly using the fact that $\pi$ is a bijective homomorphism that is the identity on $\mathbb{Q}$,
we obtain $\pi(\alpha_1)^{m_1} \in \mathbb{Q}(\omega_m)$, $\pi(\alpha_2)^{m_2} \in \mathbb{Q}(\omega_m, \pi(\alpha_1))$, $\pi(\alpha_3)^{m_3} \in \mathbb{Q}(\omega_m, \pi(\alpha_1), \pi(\alpha_2))$, ...,
$\pi(\alpha_n)^{m_n} \in \mathbb{Q}(\omega_m, \pi(\alpha_1), \ldots, \pi(\alpha_{n-1}))$. Thus, $\mathbb{Q}(\gamma)$ satisfies all the properties in
Definition 17.1 and is indeed a radical extension.


If we let $\beta_1, \ldots, \beta_t \in \mathbb{C}$ be the roots of $h(x)$, our previous argument shows that each $\mathbb{Q}(\beta_i)$
is a radical extension. Therefore, Lemma 17.5 tells us that there exists $l \in \mathbb{N}$ such that
$\mathbb{Q}(\omega_l, \beta_1, \ldots, \beta_t)$ is also a radical extension. If we let $M = \mathbb{Q}(\omega_l, \beta_1, \ldots, \beta_t)$, we observe that
$M$ also contains the $l$ roots of $x^l - 1$. As a result, $M$ is the splitting field over $\mathbb{Q}$ of the
polynomial $(x^l - 1)h(x) \in \mathbb{Q}[x]$. Thus, in addition to containing both $L$ and $K$, $M$ is a radical
extension that is the splitting field over $\mathbb{Q}$ of $g(x) = (x^l - 1)h(x) \in \mathbb{Q}[x]$, as required.




The pieces are now in place to prove the main result of this section. The idea behind the proof
will be to compare the Galois groups $\mathrm{Gal}(M/\mathbb{Q})$ and $\mathrm{Gal}(K/\mathbb{Q})$, where $M \supseteq K \supseteq \mathbb{Q}$ and $K$
and $M$ are both splitting fields over $\mathbb{Q}$ of elements of $\mathbb{Q}[x]$. In this situation, we will show that
$\mathrm{Gal}(K/\mathbb{Q})$ is the image under a group homomorphism of $\mathrm{Gal}(M/\mathbb{Q})$. The proof is one of the
more abstract ones in this book, so we will first look at two examples that illustrate this
situation.


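Examples


1. Let $M = \mathbb{Q}(\sqrt[3]{2}, \omega_3)$ and $K = \mathbb{Q}(\omega_3)$, where $\omega_3 = \mathrm{cis}\left(\frac{2\pi}{3}\right)$. Then we have $M \supseteq K \supseteq \mathbb{Q}$,
where $K$ is the splitting field over $\mathbb{Q}$ of $x^3 - 1$ and $M$ is the splitting field over $\mathbb{Q}$ of
$x^3 - 2$. Corollary 15.21 indicated that $\mathrm{Gal}(M/\mathbb{Q})$ is a nonabelian group of order 6,
so it must be isomorphic to $S_3$. We will use the same notation we used in Section 8.1
when we provided a multiplication table for $S_3$. Recall that since the elements of
$\mathrm{Gal}(M/\mathbb{Q})$ are the identity on $\mathbb{Q}$, they are completely determined by their behavior on
$\sqrt[3]{2}$ and $\omega_3$.

$e(\sqrt[3]{2}) = \sqrt[3]{2}$ and $e(\omega_3) = \omega_3$;  $f(\sqrt[3]{2}) = \sqrt[3]{2}$ and $f(\omega_3) = \omega_3^2$;
$g(\sqrt[3]{2}) = \sqrt[3]{2} \cdot \omega_3$ and $g(\omega_3) = \omega_3^2$;  $h(\sqrt[3]{2}) = \sqrt[3]{2} \cdot \omega_3^2$ and $h(\omega_3) = \omega_3^2$;
$j(\sqrt[3]{2}) = \sqrt[3]{2} \cdot \omega_3^2$ and $j(\omega_3) = \omega_3$;  $k(\sqrt[3]{2}) = \sqrt[3]{2} \cdot \omega_3$ and $k(\omega_3) = \omega_3$.

Observe that if $x \in \mathrm{Gal}(M/\mathbb{Q})$, then

\[ x(\omega_3) \in \{\omega_3, \omega_3^2\} \subseteq \mathbb{Q}(\omega_3) = K. \]

However, since $K$ is a splitting field, this is no surprise. Indeed, Lemma 15.15
guarantees that $x(K) = K$, for any $x \in \mathrm{Gal}(M/\mathbb{Q})$. Since the automorphisms in
$\mathrm{Gal}(M/\mathbb{Q})$ send elements of $K$ to $K$, we can consider the function

\[ \phi : \mathrm{Gal}(M/\mathbb{Q}) \to \mathrm{Gal}(K/\mathbb{Q}) \]

defined as $\phi(x) = \bar{x}$, where $\bar{x}$ denotes the restriction of $x$ to the smaller field $K$. Recall
that Corollary 15.24 guarantees that $\phi$ is a group homomorphism. We now have

\[ \bar{e}(\omega_3) = \bar{j}(\omega_3) = \bar{k}(\omega_3) = \omega_3 \]

and

\[ \bar{f}(\omega_3) = \bar{g}(\omega_3) = \bar{h}(\omega_3) = \omega_3^2. \]

The preceding equations show that the image of $\phi$ contains both elements of
$\mathrm{Gal}(K/\mathbb{Q})$. In addition, we can see the images under $\phi$ of the three elements $e, j, k$
are the identity element in $\mathrm{Gal}(K/\mathbb{Q})$. Therefore, the set $\{e, j, k\}$ is the kernel of $\phi$.

Another way to look at this is that the kernel of $\phi$ consists of the elements of
$\mathrm{Gal}(M/\mathbb{Q})$ which are the identity on $K$, so the kernel of $\phi$ is equal to $\mathrm{Gal}(M/K)$.
Putting all these pieces together, Theorem 8.39(a), the Isomorphism Theorem for
Groups, tells us that

\[ \mathrm{Gal}(M/\mathbb{Q})/\mathrm{Gal}(M/K) \cong \mathrm{Gal}(K/\mathbb{Q}). \]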
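2. Let $M = \mathbb{Q}(\sqrt{2}, \sqrt{3})$ and $K = \mathbb{Q}(\sqrt{2})$; then $M \supseteq K \supseteq \mathbb{Q}$, where $K$ is the splitting field
over $\mathbb{Q}$ of $x^2 - 2$ and $M$ is the splitting field over $\mathbb{Q}$ of $(x^2 - 2)(x^2 - 3)$. It is easy to see
that $\mathrm{Gal}(M/\mathbb{Q})$ is a group of order 4 that contains no element of order 4, so
$\mathrm{Gal}(M/\mathbb{Q})$ is isomorphic to $C_2 \times C_2$. We can now represent the four elements of
$\mathrm{Gal}(M/\mathbb{Q})$ by looking at their behavior on $\sqrt{2}$ and $\sqrt{3}$.

\[
\begin{aligned}
e(\sqrt{2}) &= \sqrt{2} \text{ and } e(\sqrt{3}) = \sqrt{3}; &\quad f(\sqrt{2}) &= \sqrt{2} \text{ and } f(\sqrt{3}) = -\sqrt{3}; \\
g(\sqrt{2}) &= -\sqrt{2} \text{ and } g(\sqrt{3}) = \sqrt{3}; &\quad h(\sqrt{2}) &= -\sqrt{2} \text{ and } h(\sqrt{3}) = -\sqrt{3}.
\end{aligned}
\]
Observe that if $x \in \mathrm{Gal}(M/\mathbb{Q})$, then
\[ x(\sqrt{2}) \in \{\sqrt{2}, -\sqrt{2}\} \subseteq \mathbb{Q}(\sqrt{2}) = K. \]

As in the previous example, this is no surprise, as Lemma 15.15 guarantees that
automorphisms in $\mathrm{Gal}(M/\mathbb{Q})$ send elements of $K$ to $K$. As before, we can define the
function

\[ \phi : \mathrm{Gal}(M/\mathbb{Q}) \to \mathrm{Gal}(K/\mathbb{Q}) \]

as $\phi(x) = \bar{x}$, where $\bar{x}$ denotes the restriction of $x$ to the smaller field $K$.
Corollary 15.24 asserts that $\phi$ is a group homomorphism, and we can observe that

\[ \bar{e}(\sqrt{2}) = \bar{f}(\sqrt{2}) = \sqrt{2} \]
and
\[ \bar{g}(\sqrt{2}) = \bar{h}(\sqrt{2}) = -\sqrt{2}. \]

We can see that the image of $\phi$ contains both elements of $\mathrm{Gal}(K/\mathbb{Q})$ and the kernel
of $\phi$ is the set $\{e, f\}$. Note that the kernel of $\phi$ consists of the same elements
as $\mathrm{Gal}(M/K)$, so Theorem 8.39(a), the Isomorphism Theorem for Groups, tells us that

\[ \mathrm{Gal}(M/\mathbb{Q})/\mathrm{Gal}(M/K) \cong \mathrm{Gal}(K/\mathbb{Q}). \]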
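
The second example can be checked mechanically. The following is a minimal sketch, assuming we encode each automorphism of $M = \mathbb{Q}(\sqrt{2}, \sqrt{3})$ by the pair of signs it puts on $\sqrt{2}$ and $\sqrt{3}$; the names `compose` and `restrict` are illustrative choices and do not come from the text.

```python
from itertools import product

# Each automorphism of M = Q(sqrt2, sqrt3) is recorded by the signs (s2, s3),
# meaning sqrt2 -> s2*sqrt2 and sqrt3 -> s3*sqrt3 (an encoding assumed here).
G = {"e": (1, 1), "f": (1, -1), "g": (-1, 1), "h": (-1, -1)}

def compose(x, y):
    """Composition of two automorphisms: the signs multiply componentwise."""
    return (x[0] * y[0], x[1] * y[1])

def restrict(x):
    """The map phi: restricting to K = Q(sqrt2) keeps only the sign on sqrt2."""
    return x[0]

# phi respects composition, so it is a group homomorphism.
is_hom = all(
    restrict(compose(x, y)) == restrict(x) * restrict(y)
    for x, y in product(G.values(), repeat=2)
)
kernel = sorted(name for name, x in G.items() if restrict(x) == 1)
image = sorted({restrict(x) for x in G.values()})

print(is_hom)   # True
print(kernel)   # ['e', 'f']
print(image)    # [-1, 1]
```

The kernel has 2 elements and the image has 2 elements, matching $|\mathrm{Gal}(M/\mathbb{Q})| / |\mathrm{Gal}(M/K)| = 4/2 = |\mathrm{Gal}(K/\mathbb{Q})|$.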


We can now state and prove Galois’s beautiful result on the insolvability of polynomials by 
radicals. 




Theorem 17.7. If $f(x) \in \mathbb{Q}[x]$ is solvable by radicals and $K$ is the splitting field of $f(x)$ over
$\mathbb{Q}$, then $\mathrm{Gal}(K/\mathbb{Q})$ is a solvable group.


Proof. Definition 17.2 and Lemma 17.6 tell us that there exists a radical extension $M$ such that
$M \supseteq K \supseteq \mathbb{Q}$ and $M$ is the splitting field over $\mathbb{Q}$ of some $g(x) \in \mathbb{Q}[x]$. Observe that since $M$ is
the smallest field that contains $\mathbb{Q}$ and all the roots of $g(x)$, it is also the smallest field that
contains $K$ and the roots of $g(x)$. Therefore, $M$ is also the splitting field of $g(x)$ over $K$.


When we combine the preceding observation with the fact that K and M are splitting fields 
over Q, Theorem 15.17 tells us that 


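\[ (1) \quad |\mathrm{Gal}(M/K)| = [M : K], \quad |\mathrm{Gal}(K/\mathbb{Q})| = [K : \mathbb{Q}], \quad \text{and} \quad |\mathrm{Gal}(M/\mathbb{Q})| = [M : \mathbb{Q}]. \]


Since $K$ is a splitting field, Lemma 15.15 asserts that for every $g \in \mathrm{Gal}(M/\mathbb{Q})$, we have
$g(K) = K$. As a result, we can define the function

\[ \phi : \mathrm{Gal}(M/\mathbb{Q}) \to \mathrm{Gal}(K/\mathbb{Q}) \]

as

\[ \phi(g) = \bar{g}, \]
where $\bar{g}$ is the restriction of $g$ to the smaller field $K$.


Corollary 15.24 tells us that $\phi$ is a homomorphism and $\mathrm{Gal}(M/K)$ is the kernel of $\phi$. Applying
Theorem 8.39(a), the Isomorphism Theorem for Groups, we now have

\[ \mathrm{Gal}(M/\mathbb{Q})/\mathrm{Gal}(M/K) \cong \mathrm{Im}(\phi). \]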


Our goal is to show that $\mathrm{Gal}(K/\mathbb{Q})$ is solvable. Proposition 17.4 told us that $\mathrm{Gal}(M/\mathbb{Q})$ is
solvable, and the proof of Corollary 8.40 indicates that the image under a homomorphism of a
solvable group is solvable. As a result, we do know that $\mathrm{Im}(\phi)$ is a solvable group. Therefore,
we will be done if we can show that $\mathrm{Im}(\phi)$ is all of $\mathrm{Gal}(K/\mathbb{Q})$. Theorem 15.6 tells us that

\[ [M : \mathbb{Q}] = [M : K] \cdot [K : \mathbb{Q}] \]

and Lagrange's Theorem tells us that

\[ |\mathrm{Im}(\phi)| = \frac{|\mathrm{Gal}(M/\mathbb{Q})|}{|\mathrm{Gal}(M/K)|}. \]

When we combine this with equation (1), we have

\[ |\mathrm{Im}(\phi)| = \frac{|\mathrm{Gal}(M/\mathbb{Q})|}{|\mathrm{Gal}(M/K)|} = \frac{[M : \mathbb{Q}]}{[M : K]} = [K : \mathbb{Q}] = |\mathrm{Gal}(K/\mathbb{Q})|. \]

Since $\mathrm{Im}(\phi) \subseteq \mathrm{Gal}(K/\mathbb{Q})$ and $|\mathrm{Im}(\phi)| = |\mathrm{Gal}(K/\mathbb{Q})|$, it now follows that $\mathrm{Im}(\phi) =
\mathrm{Gal}(K/\mathbb{Q})$. Having already shown that $\mathrm{Im}(\phi)$ is solvable, we know now that $\mathrm{Gal}(K/\mathbb{Q})$ is
solvable, as required.


17.2 A Proof of the Insolvability of the Quintic 


At the beginning of this chapter, we described the three steps needed to prove the insolvability 
of the quintic. In Section 17.1, we worked through the first two steps. For the final step, we 
now present a technique for producing polynomials which are not solvable by radicals. 


Theorem 17.8. Suppose $f(x) \in \mathbb{Q}[x]$ has degree $p$, is irreducible in $\mathbb{Q}[x]$, and has exactly
$p - 2$ real roots, where $p$ is prime. If we let $K$ be the splitting field of $f(x)$ over $\mathbb{Q}$, then
$\mathrm{Gal}(K/\mathbb{Q})$ is the symmetric group $S_p$.


Proof. Since $f(x)$ is irreducible in $\mathbb{Q}[x]$, the Fundamental Theorem of Algebra and
Corollary 10.24 assert that $f(x)$ has $p$ distinct roots in $\mathbb{C}$, which we will call $\alpha_1, \alpha_2, \ldots, \alpha_p$.
Corollary 5.13 tells us that every element of $\mathrm{Gal}(K/\mathbb{Q})$ sends roots of $f(x)$ to roots of $f(x)$.

In addition, since $K = \mathbb{Q}(\alpha_1, \alpha_2, \ldots, \alpha_p)$, different elements of $\mathrm{Gal}(K/\mathbb{Q})$ give different bijections
of the set of roots of $f(x)$. Therefore, $\mathrm{Gal}(K/\mathbb{Q})$ is a subgroup of $S_p$, and we need to show that
$\mathrm{Gal}(K/\mathbb{Q})$ is all of $S_p$.


Since $f(x)$ has exactly $p - 2$ real roots, we can order the $p$ roots so that $\alpha_1$ and $\alpha_2$ do not
belong to $\mathbb{R}$ and the other $p - 2$ roots do belong to $\mathbb{R}$. Lemma 15.15 asserts that since $K$ is a
splitting field, complex conjugation restricts to an automorphism $g$ of $K$. Therefore, $g$ belongs
to $\mathrm{Gal}(K/\mathbb{Q})$, and we have

\[ g(\alpha_1) = \alpha_2, \quad g(\alpha_2) = \alpha_1, \quad \text{and} \quad g(\alpha_i) = \alpha_i, \]
for $3 \leq i \leq p$. We can see that $g$ is the transposition $(12) \in S_p$.
Since $f(x)$ is irreducible of degree $p$, $[\mathbb{Q}(\alpha_1) : \mathbb{Q}] = p$. However, Theorem 15.6 tells us that
\[ [K : \mathbb{Q}] = [K : \mathbb{Q}(\alpha_1)] \cdot [\mathbb{Q}(\alpha_1) : \mathbb{Q}], \]

so $[K : \mathbb{Q}]$ is a multiple of $p$. On the other hand, since $K$ is a splitting field, Theorem 15.17
asserts that

\[ |\mathrm{Gal}(K/\mathbb{Q})| = [K : \mathbb{Q}]. \]

As a result, $|\mathrm{Gal}(K/\mathbb{Q})|$ is a multiple of $p$. Therefore, Sylow's Theorem tells us that $\mathrm{Gal}(K/\mathbb{Q})$
contains a subgroup of order $p$, and it immediately follows that $\mathrm{Gal}(K/\mathbb{Q})$ contains an
element of order $p$.




At this point, we have succeeded in showing that $\mathrm{Gal}(K/\mathbb{Q})$ is a subgroup of $S_p$ that contains
both a transposition and an element of order $p$. Theorem 8.51 now tells us that $\mathrm{Gal}(K/\mathbb{Q})$ is
indeed equal to all of $S_p$.
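
The group-theoretic fact used in the last step can be confirmed by brute force for $p = 5$. The sketch below is only an illustration, assuming permutations of $\{0, \ldots, n-1\}$ are stored as tuples; `compose` and `generated_subgroup` are hypothetical helper names, and the computation is a check of one instance, not a proof of Theorem 8.51.

```python
def compose(s, t):
    """Return the permutation 'apply t first, then s' on {0, ..., n-1}."""
    return tuple(s[t[i]] for i in range(len(s)))

def generated_subgroup(gens):
    """All products of the generators; in a finite group this is the subgroup they generate."""
    n = len(gens[0])
    identity = tuple(range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

# p = 5: a transposition together with an element of order 5.
transposition = (1, 0, 2, 3, 4)    # swaps the first two roots
five_cycle = (1, 2, 3, 4, 0)       # i -> i + 1 (mod 5)
H5 = generated_subgroup([transposition, five_cycle])
print(len(H5))   # 120, the order of S_5

# For contrast, n = 4 is not prime: the transposition (0 2) and a 4-cycle
# generate only a group of order 8.
H4 = generated_subgroup([(2, 1, 0, 3), (1, 2, 3, 0)])
print(len(H4))   # 8
```

The contrast with $n = 4$ shows why the primality of $p$ matters in the hypothesis.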


It has been a very long and interesting journey. We have been introduced to many algebraic 
objects and concepts such as automorphisms, solvable groups, symmetric groups, Sylow’s 
Theorem, irreducibility criteria, splitting fields, and Galois groups. By applying these and 
various other algebraic ideas, we have achieved one of the primary goals of this course, the 
ability to produce fifth-degree polynomials that are insolvable by radicals. 


Corollary 17.9. If $p$ is a prime and $n \geq 2$ is an integer, then the polynomial $x^5 - npx + p$ is
not solvable by radicals.


Proof. We will begin by showing that $f(x) = x^5 - npx + p$ satisfies the conditions of
Theorem 17.8. First, using the prime $p$, Eisenstein's Criterion tells us that $f(x)$ is irreducible
over $\mathbb{Q}$. Next, the derivative of $f(x)$ is $5x^4 - np$ and the only real roots of $f'(x)$ are $\pm\sqrt[4]{np/5}$.
Since $f'(x)$ has only two real roots, Rolle's Theorem asserts that $f(x)$ has at most three
real roots.


We would now like to show that $f(x)$ changes sign three times, for that would allow us to use
the Intermediate Value Theorem to assert that $f(x)$ has at least three real roots. When $x$ is a
negative real number with a large absolute value, $f(x) < 0$. However, $f(0) = p > 0$, so there
exists at least one negative root. Next, observe that

\[ f(1) = 1 - np + p = 1 - p(n - 1) < 0, \]
so $f(x)$ also has a root between 0 and 1. Finally, for large values of $x$, $f(x)$ is again positive, so
there is a third sign change and a third real root occurring when $x > 1$.


As a result, $f(x)$ is irreducible over $\mathbb{Q}$, has degree 5, and has exactly 3 real roots. We can
now apply Theorem 17.8, and it tells us that if $K$ is the splitting field of $x^5 - npx + p$ over $\mathbb{Q}$,
then $\mathrm{Gal}(K/\mathbb{Q}) = S_5$. Theorem 8.53 asserts that $S_5$ is not solvable, and it now follows from
Theorem 17.7 that $x^5 - npx + p$ is not solvable by radicals.


By plugging values of $n$ and $p$ into $x^5 - npx + p$, we can easily produce examples of
fifth-degree polynomials that are not solvable by radicals.
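
The two checks in the proof, Eisenstein's Criterion and the three sign changes, are easy to reproduce numerically. The following is a hedged sketch rather than part of the argument: the helper names are invented for illustration, and the test point `big = n*p + 2` is an assumed choice that is large enough for $x^5$ to dominate.

```python
def f(x, n, p):
    """The quintic f(x) = x^5 - n*p*x + p."""
    return x**5 - n * p * x + p

def eisenstein_applies(coeffs, p):
    """coeffs lists integer coefficients from the leading term down to the constant term."""
    lead, *rest = coeffs
    return (
        lead % p != 0                       # p does not divide the leading coefficient
        and all(c % p == 0 for c in rest)   # p divides every other coefficient
        and rest[-1] % (p * p) != 0         # p^2 does not divide the constant term
    )

def sign_changes(n, p):
    """Count sign changes of f at the test points used in the proof."""
    big = n * p + 2   # assumption: far enough from 0 that x^5 dominates
    values = [f(x, n, p) for x in (-big, 0, 1, big)]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

for p in (2, 3, 5, 7):
    for n in range(2, 7):
        coeffs = [1, 0, 0, 0, -n * p, p]
        print(p, n, eisenstein_applies(coeffs, p), sign_changes(n, p))
```

Every line printed should report `True` and `3`, in agreement with the irreducibility claim and the three real roots located in the proof.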


Examples 


In light of Corollary 17.9, the following quintics are insolvable by radicals: 


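\[
\begin{array}{lllll}
x^5 - 4x + 2, & x^5 - 6x + 2, & x^5 - 8x + 2, & x^5 - 10x + 2, & x^5 - 12x + 2, \\
x^5 - 6x + 3, & x^5 - 9x + 3, & x^5 - 12x + 3, & x^5 - 15x + 3, & x^5 - 18x + 3, \\
x^5 - 10x + 5, & x^5 - 15x + 5, & x^5 - 20x + 5, & x^5 - 25x + 5, & x^5 - 30x + 5, \\
x^5 - 14x + 7, & x^5 - 21x + 7, & x^5 - 28x + 7, & x^5 - 35x + 7, & x^5 - 42x + 7.
\end{array}
\]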


We used Theorem 17.8 to prove Corollary 17.9 and this corollary allowed us to produce 

an infinite number of polynomials of degree five not solvable by radicals. However, we do not 
need to stop with quintics. Theorem 17.8 can also be used to produce polynomials of higher 
degree not solvable by radicals. In fact, our next corollary allows us to produce an infinite 
number of seventh-degree polynomials that are insolvable by radicals. 


Corollary 17.10. If $p$ is a prime, then the polynomial $x^7 - 5px^5 + 7px + p$ is not solvable by
radicals.


Proof. If we let $g(x) = x^7 - 5px^5 + 7px + p$, then Eisenstein's Criterion with the prime $p$
implies that $g(x)$ is irreducible over $\mathbb{Q}$. Observe that

\[ g''(x) = 42x^5 - 100px^3 = x^3(42x^2 - 100p). \]

Therefore, $g''(x)$ has only three distinct real roots, so Rolle's Theorem tells us that $g'(x)$ has at
most four distinct real roots. Another application of Rolle's Theorem now tells us that $g(x)$ has
at most five distinct real roots. However, since $g(x)$ is irreducible, Corollary 10.24 tells us that
all of its roots are distinct. Therefore, $g(x)$ has at most five real roots.


On the other hand, consider the following values of $g(x)$:

\[
\begin{aligned}
g(-3) &= -2187 + 1195p > 0, &\quad g(-1) &= -1 - p < 0, \\
g(0) &= p > 0, &\quad g(2) &= 128 - 145p < 0.
\end{aligned}
\]

Certainly, if $x$ is negative with a large absolute value, we know that $g(x) < 0$. Therefore the
Intermediate Value Theorem asserts that there is a real root when $x < -3$. In addition, we can
see that $g(x)$ changes sign between $-3$ and $-1$, between $-1$ and 0, and between 0 and 2.
Therefore, the Intermediate Value Theorem says that there are at least three more real roots
between $-3$ and 2. In addition, when $x$ is large, we know that $g(x) > 0$. Therefore, there is yet
another sign change and at least one more real root when $x > 2$. As a result, $g(x)$ has at least
five real roots, and the preceding paragraph showed that $g(x)$ has at most five real roots. Thus,
$g(x)$ has exactly five real roots.


The preceding arguments show that $g(x)$ is irreducible over $\mathbb{Q}$, has degree 7, and has exactly
5 real roots. Theorem 17.8 now asserts that if $K$ is the splitting field of $x^7 - 5px^5 + 7px + p$
over $\mathbb{Q}$, then $\mathrm{Gal}(K/\mathbb{Q}) = S_7$. Theorem 8.53 asserts that $S_7$ is not solvable, and it now follows
from Theorem 17.7 that $x^7 - 5px^5 + 7px + p$ is not solvable by radicals.
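
The four closed-form values of $g$ quoted in the proof, and the resulting count of five sign changes, can be double-checked with exact integer arithmetic. This is a small sketch under the assumption that `big = 5*p + 10` is far enough out for $x^7$ to dominate; the function names are illustrative only.

```python
def g(x, p):
    """The polynomial g(x) = x^7 - 5p x^5 + 7p x + p from Corollary 17.10."""
    return x**7 - 5 * p * x**5 + 7 * p * x + p

def sign_changes(p):
    """Count sign changes of g across the test points used in the proof."""
    big = 5 * p + 10   # assumption: large enough that x^7 dominates
    values = [g(x, p) for x in (-big, -3, -1, 0, 2, big)]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

for p in (2, 3, 5, 7, 11, 13):
    # The closed forms stated in the proof.
    assert g(-3, p) == -2187 + 1195 * p
    assert g(-1, p) == -1 - p
    assert g(0, p) == p
    assert g(2, p) == 128 - 145 * p
    print(p, sign_changes(p))   # each prime should report 5 sign changes
```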




By plugging primes into $x^7 - 5px^5 + 7px + p$, we can now produce an infinite number of
polynomials of degree seven that are insolvable by radicals.


Examples


It follows from Corollary 17.10 that the following seventh-degree polynomials are
insolvable by radicals:

\[
\begin{array}{lll}
x^7 - 10x^5 + 14x + 2, & x^7 - 15x^5 + 21x + 3, & x^7 - 25x^5 + 35x + 5, \\
x^7 - 35x^5 + 49x + 7, & x^7 - 55x^5 + 77x + 11, & x^7 - 65x^5 + 91x + 13.
\end{array}
\]


Exercises for Sections 17.1 and 17.2 


In exercises 1-6, use the notation from Definition 17.1 and find $m \in \mathbb{N}$ and $\alpha_1, \ldots, \alpha_n \in \mathbb{C}$
such that the given element belongs to the radical extension $\mathbb{Q}(\omega_m, \alpha_1, \ldots, \alpha_n)$.

1. 672-83 

2. 4/5+7V19 — 14/23 

3. 276-3711 

4. 6/10+573— 18/13 

5. ¥6+/23 

6. V8/19-47/51+ 24 


Before doing exercises 7-14 or 15-22, please read the following paragraph: 


In the proof of the insolvability of the quintic, a key step was the fact that if $M \supseteq K \supseteq \mathbb{Q}$
are fields such that both $M$ and $K$ are splitting fields of polynomials in $\mathbb{Q}[x]$, then the
homomorphism

\[ \phi : \mathrm{Gal}(M/\mathbb{Q}) \to \mathrm{Gal}(K/\mathbb{Q}) \]

described in Corollary 15.24 is a surjection. In exercises 7-14 and 15-22, we provide
examples where $M$ is not a splitting field and $\phi$ is not surjective. This explains why it was
necessary to replace the field $L$ in Proposition 17.4 by the splitting field $M$ in Theorem 17.7.


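In exercises 7-14, let $K = \mathbb{Q}\left(2^{1/3}, \mathrm{cis}\left(\tfrac{2\pi}{3}\right)\right)$ and $M = \mathbb{Q}\left(2^{1/9}, \mathrm{cis}\left(\tfrac{2\pi}{3}\right)\right)$. Observe that
$M \supseteq K \supseteq \mathbb{Q}$.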


7. Show that $K$ is the splitting field over $\mathbb{Q}$ of some $f(x) \in \mathbb{Q}[x]$.
8. Show that $M$ is the splitting field over $K$ of some $g(x) \in K[x]$.
9. Show that if $g \in \mathrm{Gal}(M/\mathbb{Q})$, then $g(2^{1/3}) = 2^{1/3}$.
10. Show that the homomorphism $\phi : \mathrm{Gal}(M/\mathbb{Q}) \to \mathrm{Gal}(K/\mathbb{Q})$ from Corollary 15.24 is not
a surjection.
11. Show that $M$ is not the splitting field over $\mathbb{Q}$ of any $h(x) \in \mathbb{Q}[x]$.
12. Determine $[K : \mathbb{Q}]$, $[M : \mathbb{Q}]$, and $[M : K]$.
13. Determine $|\mathrm{Gal}(K/\mathbb{Q})|$, $|\mathrm{Gal}(M/\mathbb{Q})|$, and $|\mathrm{Gal}(M/K)|$.
14. Determine $|\mathrm{Im}(\phi)|$ and compare it to $|\mathrm{Gal}(K/\mathbb{Q})|$.


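In exercises 15-22, let $K = \mathbb{Q}\left(6^{1/5}, \mathrm{cis}\left(\tfrac{2\pi}{5}\right)\right)$ and $M = \mathbb{Q}\left(6^{1/25}, \mathrm{cis}\left(\tfrac{2\pi}{5}\right)\right)$. Observe that
$M \supseteq K \supseteq \mathbb{Q}$.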


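15. Show that $K$ is the splitting field over $\mathbb{Q}$ of some $f(x) \in \mathbb{Q}[x]$.
16. Show that $M$ is the splitting field over $K$ of some $g(x) \in K[x]$.
17. Show that if $g \in \mathrm{Gal}(M/\mathbb{Q})$, then $g(6^{1/5}) = 6^{1/5}$.
18. Show that the homomorphism $\phi : \mathrm{Gal}(M/\mathbb{Q}) \to \mathrm{Gal}(K/\mathbb{Q})$ from Corollary 15.24 is not
a surjection.
19. Show that $M$ is not the splitting field over $\mathbb{Q}$ of any $h(x) \in \mathbb{Q}[x]$.
20. Determine $[K : \mathbb{Q}]$, $[M : \mathbb{Q}]$, and $[M : K]$.
21. Determine $|\mathrm{Gal}(K/\mathbb{Q})|$, $|\mathrm{Gal}(M/\mathbb{Q})|$, and $|\mathrm{Gal}(M/K)|$.
22. Determine $|\mathrm{Im}(\phi)|$ and compare it to $|\mathrm{Gal}(K/\mathbb{Q})|$.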


In exercises 23-30, you will need to manipulate the polynomials from Corollaries 17.9 
and 17.10 to produce additional infinite classes of polynomials that are not solvable by radicals. 


23. Show that if $p$ is a prime and $n \geq 2$ is an integer, then $x^5 - npx - p$ is not solvable by
radicals.
24. Show that if $p$ is a prime and $n \geq 2$ is an integer, then $p^4x^5 - npx + 1$ is not solvable by
radicals.
25. Show that if $p$ is a prime and $n \geq 2$ is an integer, then $px^5 - npx^4 + 1$ is not solvable by
radicals.
26. Show that if $p$ is a prime and $n \geq 2$ is an integer, then $x^5 - npx^4 + p^4$ is not solvable by
radicals.
27. Show that if $p$ is prime, then $x^7 - 5px^5 + 7px - p$ is not solvable by radicals.




28. Show that if $p$ is prime, then $p^6x^7 - 5p^5x^5 + 7px + 1$ is not solvable by radicals.
29. Show that if $p$ is prime, then $px^7 + 7px^6 - 5px^2 + 1$ is not solvable by radicals.
30. Show that if $p$ is prime, then $x^7 + 7px^6 - 5px^2 + p^6$ is not solvable by radicals.


In exercises 31-34, we look at some polynomials that, despite having degree at least 5, are 
solvable by radicals. When doing these exercises you may use the fact, which was shown in 
Chapter 10, that polynomials of degree at most 4 are solvable by radicals. 


31. If $a, b, c, d \in \mathbb{Q}$ with $a \neq 0$, show that $ax^6 + bx^4 + cx^2 + d$ is solvable by radicals.
32. If $a, b, c, d, e \in \mathbb{Q}$ with $a \neq 0$, show that $ax^8 + bx^6 + cx^4 + dx^2 + e$ is solvable by radicals.
33. If $a, b, c \in \mathbb{Q}$ with $a \neq 0$, show that $ax^6 + bx^3 + c$ is solvable by radicals.
34. If $a, b, c, d \in \mathbb{Q}$ with $a \neq 0$, show that $ax^9 + bx^6 + cx^3 + d$ is solvable by radicals.


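In exercises 35-40, we examine fields of the form $\mathbb{Q}(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})$, where
$p_1, p_2, \ldots, p_n$ are distinct primes. This will generalize the work we have done on fields
of the form $\mathbb{Q}(\sqrt{2})$, $\mathbb{Q}(\sqrt{7}, \sqrt{11})$, and $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$. In these exercises, $p_1, p_2, \ldots, p_n$
will be distinct primes, and $d_1, d_2, \ldots, d_{2^n}$ will be the $2^n$ positive integers of the form
$p_1^{i_1} p_2^{i_2} \cdots p_n^{i_n}$, where each $i_j$ is either 0 or 1. We will also let $T = \{\sqrt{d_1}, \sqrt{d_2}, \ldots, \sqrt{d_{2^n}}\}$.


35. If the set $T$ is linearly independent over $\mathbb{Q}$, show that

\[ [\mathbb{Q}(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n}) : \mathbb{Q}] = 2^n. \]

36. If the set $T$ is linearly independent over $\mathbb{Q}$, show that

\[ |\mathrm{Gal}(\mathbb{Q}(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})/\mathbb{Q})| = 2^n. \]

37. Suppose the set $T$ is linearly independent over $\mathbb{Q}$ and also suppose that $v \in \mathbb{Q}(\sqrt{p_1},
\sqrt{p_2}, \ldots, \sqrt{p_n})$ is a linear combination of at least two elements of $T$. Show that there
exists $g \in \mathrm{Gal}(\mathbb{Q}(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})/\mathbb{Q})$ such that $g(v) \neq v$ and $g(v) \neq -v$.

38. If $p$ is a prime that does not belong to the set $\{p_1, p_2, \ldots, p_n\}$, and if the set $T$ is linearly
independent over $\mathbb{Q}$, show that $\sqrt{p} \notin \mathbb{Q}(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})$.

39. Use Mathematical Induction and exercises 35-38 to show that if $p_1, p_2, \ldots, p_n$ are
distinct primes, then

\[ [\mathbb{Q}(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n}) : \mathbb{Q}] = 2^n. \]

40. If $p_1, p_2, \ldots, p_n$ are distinct primes, show that

\[ \mathrm{Gal}(\mathbb{Q}(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})/\mathbb{Q}) \cong C_2 \times C_2 \times \cdots \times C_2, \]

the direct product of $n$ copies of the cyclic group of order 2.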




17.3 Kronecker’s Theorem 


Included in the subtitle of this book is the phrase “from the integers to the insolvability of the 
quintic.” We conclude this book by coming full circle and returning to the integers. We will 
look at some properties of the integers that can be generalized and applied to more abstract 
settings. 


The proof we presented of the insolvability of the quintic made heavy use of the Fundamental 
Theorem of Algebra. The fact that every nonconstant $p(x) \in \mathbb{C}[x]$ has a root in $\mathbb{C}$ meant that
when we looked at the roots of $f(x) \in \mathbb{Q}[x]$, we could look at fields $K$ such that $\mathbb{Q} \subseteq K \subseteq \mathbb{C}$
and $K$ was a finite extension of $\mathbb{Q}$. Knowing that the splitting field of $f(x)$ lived between $\mathbb{Q}$
and C greatly simplified the work needed to compute Gal(K/Q). 


On the other hand, it is very common in abstract algebra books to not use the Fundamental 
Theorem of Algebra when proving the insolvability of the quintic. Using that approach, proofs 
of facts about splitting fields and Galois groups become much more abstract and technical. 

An important piece of the puzzle becomes the proof that if $K$ is a field and if $f(x) \in K[x]$ has
degree at least 1, then there exists a field L such that L is a finite extension of K and also 
contains a root of f(x). The main goal of this section is to prove this fact, which is known as 
Kronecker’s Theorem. Its proof will reveal interesting properties of rings as well as show 

us how the integers can be useful in other contexts. 


As we saw in Chapter 8, if G is a group and H is a subgroup, then the set of left (or right)
cosets forms a group if and only if H is normal. If R is a ring and S is a subgroup under 
addition then, since R is an abelian group under addition, S is certainly a normal subgroup. 
Therefore, the set of cosets, R/S, is a group under addition. It is reasonable to ask if R/S is 
also a ring. 


Since addition in R/S is defined by adding the names of the cosets, in trying to multiply 
cosets, it would make sense to define coset multiplication as 


(a+S)(b+S)=ab+S. 


In R, we can both add and multiply, but since S is a subgroup under addition, we will 
express cosets using additive notation. In order to check if the multiplication of cosets is well 
defined, we need to determine if changing the names of the cosets can change the product 
when we multiply. 


To this end, suppose $a \in R$ and $b \in S$. Since $b + S = 0 + S$, if the multiplication of cosets were
well defined, then we would have

\[ ab + S = (a + S)(b + S) = (a + S)(0 + S) = 0 + S \]
and
\[ ba + S = (b + S)(a + S) = (0 + S)(a + S) = 0 + S. \]

Therefore, if coset multiplication is well defined, it follows that
\[ ab + S = 0 + S = ba + S, \]

so $ab, ba \in S$. As we will soon see, this condition is not only necessary, but is also sufficient.
This motivates


Definition 17.11. If $R$ is a ring, a subgroup $I$ under addition is called an ideal if $ab, ba \in I$,
for all $a \in R$, $b \in I$.


Since the definition of a ring included the commutativity of multiplication, requiring both $ab$
and $ba$ to belong to $I$ is redundant. However, the reason we do this is the same reason that the
definition of a ring includes two distributive laws. Namely, if we drop the assumption of
commutativity and study noncommutative rings, we need both $ab$ and $ba$ to belong to $I$ for
the multiplication of cosets to be well defined. In this section, we will primarily concern
ourselves with the ideals of $\mathbb{Z}$ and of $F[x]$, for fields $F$, and will determine the ideals that arise
in these cases.


Examples


1. The Ideals of $\mathbb{Z}$


For any ring $R$, the sets $\{0\}$ and $R$ are always ideals. If $I \neq \{0\}$ is an ideal of $\mathbb{Z}$, suppose
$a \neq 0$ belongs to $I$. Then exactly one of $a$ or $-a$ is positive, so $I$ contains at least one
positive integer. Therefore, the Well Ordering Principle asserts that there is a smallest
positive integer $n$ belonging to $I$. If we let $n\mathbb{Z} = \{nb \mid b \in \mathbb{Z}\}$, we claim that $I = n\mathbb{Z}$.


Since $I$ is an ideal of $\mathbb{Z}$ and $n \in I$, it follows that $n\mathbb{Z} \subseteq I$. In the other direction, if
$m \in I$, the division algorithm tells us that there exist $q, r \in \mathbb{Z}$ such that

\[ m = q \cdot n + r \quad \text{and} \quad 0 \leq r < n. \]
Observe that both $m$ and $q \cdot n$ belong to $I$, so
\[ r = m - q \cdot n \in I. \]

If $r \neq 0$, then $r$ would be an element of $I$ that is both positive and smaller than $n$. But
this contradicts the minimality of $n$, so $r = 0$ and so

\[ m = q \cdot n \in n\mathbb{Z}. \]

As a result, $I \subseteq n\mathbb{Z}$ and so $I = n\mathbb{Z}$.


If $R$ is a ring and $r \in R$, we will let $(r) = \{r \cdot s \mid s \in R\}$. Using this notation, the
preceding argument tells us that every nonzero ideal of $\mathbb{Z}$ is of the form $(n)$, for some
$n \in \mathbb{N}$. Observe that the set $\{0\}$ is equal to $(0)$, so every ideal of $\mathbb{Z}$ is of the form $(n)$,
for some $n \in \mathbb{Z}$. For example, $(2)$ consists of all even integers, $(100)$ consists of all
multiples of 100, and $(1)$ is the entire set $\mathbb{Z}$.
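
To see the argument in action, take the ideal of all integer combinations $as + bt$ of two fixed integers $a$ and $b$. Its smallest positive element should generate it, and that element is $\gcd(a, b)$. The sketch below only samples a finite window of combinations, so the parameter `bound` is an assumption that must be large enough to reach the generator; the function name is illustrative.

```python
from math import gcd

def sampled_combinations(a, b, bound=50):
    """A finite sample of the ideal {a*s + b*t}: all values with |s|, |t| <= bound."""
    return {a * s + b * t
            for s in range(-bound, bound + 1)
            for t in range(-bound, bound + 1)}

for a, b in [(12, 18), (35, 21), (100, 64)]:
    sample = sampled_combinations(a, b)
    n = min(v for v in sample if v > 0)      # smallest positive element found
    # Every sampled element is a multiple of n, as the division algorithm predicts.
    assert all(v % n == 0 for v in sample)
    print(a, b, n, gcd(a, b))
```

For each pair, the smallest positive element found coincides with the greatest common divisor, so the sampled ideal sits inside $(n)$ with $n = \gcd(a, b)$.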


We can now exploit similarities between $\mathbb{Z}$ and $F[x]$ to describe all ideals of $F[x]$.


Examples


2. The Ideals of $F[x]$, where $F$ is a field.


One of the similarities between the rings $\mathbb{Z}$ and $F[x]$ is that they both have a division
algorithm. We will exploit this to find all ideals of $F[x]$. If $I \neq \{0\}$ is an ideal of $F[x]$,
we will first consider the case where $I$ contains a nonzero constant $a \in F$. In this
situation, for any $f(x) \in F[x]$,

\[ f(x) = (f(x) a^{-1}) \cdot a \in I, \]
so $I$ is equal to all of $F[x]$.


On the other hand, suppose $I$ contains no nonzero constants. Then every nonzero
element of $I$ has positive degree, and the Well Ordering Principle asserts that
there is a smallest positive degree $n$ from among all elements of $I$. Now let $p(x) \in I$
have degree $n$; we claim $I = (p(x))$.


Since $I$ is an ideal, it is certainly the case that $(p(x)) \subseteq I$. On the other hand, if
$f(x) \in I$, the division algorithm tells us that there exist $q(x), r(x) \in F[x]$ such that

\[ f(x) = q(x) \cdot p(x) + r(x) \]
and $r(x) = 0$ or has degree smaller than $n$.
Since $q(x) \cdot p(x) \in I$, we now have

\[ r(x) = f(x) - q(x) \cdot p(x), \]
so $r(x) \in I$. It is impossible for $r(x)$ to have positive degree, for that would contradict
the minimality of the degree of $p(x)$. Since $I$ contains no nonzero constants, $r(x) = 0$ and

\[ f(x) = q(x) \cdot p(x) \in (p(x)). \]

As a result, $I \subseteq (p(x))$, so $I = (p(x))$.


For example, $(x)$ equals the set of polynomials with 0 constant term, $(x - 5)$ is all
multiples of $x - 5$ and therefore consists of all polynomials with 5 as a root, and $(1)$
is equal to all of $F[x]$.


We can now prove 


Theorem 17.12. If $R$ is a ring and $I$ is a subgroup under addition, we define the
multiplication of cosets as

\[ (a + I)(b + I) = ab + I, \]
for all $a, b \in R$. Then multiplication is well defined and $R/I$ forms a ring if and only if $I$ is an
ideal.


Proof. Our earlier discussion showed that if multiplication is well defined, then $I$ must be an
ideal. Now we will show that if $I$ is an ideal, then multiplication is well defined. In particular,
we need to show that changing the names of the cosets does not change the answer when we
multiply. To this end, suppose $a, b, c, d \in R$ such that

\[ a + I = b + I \quad \text{and} \quad c + I = d + I; \]

we must show that
\[ (a + I)(c + I) = (b + I)(d + I). \]

We know that $a - b, c - d \in I$, so there exist $x, y \in I$ such that $a = b + x$ and $c = d + y$. Thus
\[ ac = (b + x)(d + y) = bd + xd + by + xy. \]
Since $I$ is an ideal, $xd, by, xy \in I$, hence $ac - bd \in I$. This tells us that $ac + I = bd + I$.


The preceding calculations and the definition of coset multiplication combine to tell us that
\[ (a + I)(c + I) = ac + I = bd + I = (b + I)(d + I), \]
as desired. Thus the multiplication of cosets is well defined.


Since $I$ is a normal subgroup of $R$ under addition, Corollary 8.22 verified that $R/I$ is a group.
Next we need to show that associativity under multiplication, commutativity under addition
and multiplication, and the distributive laws are inherited by $R/I$ from $R$. These properties are
verified, one at a time, in the following four equations:
\[
\begin{aligned}
(a + I)((b + I)(c + I)) &= (a + I)(bc + I) = a(bc) + I = (ab)c + I \\
&= (ab + I)(c + I) = ((a + I)(b + I))(c + I); \\
(a + I) + (b + I) &= (a + b) + I = (b + a) + I = (b + I) + (a + I); \\
(a + I)(b + I) &= (ab) + I = (ba) + I = (b + I)(a + I); \\
(a + I)((b + I) + (c + I)) &= (a + I)((b + c) + I) = a(b + c) + I = (ab + ac) + I \\
&= (ab + I) + (ac + I) = ((a + I)(b + I)) + ((a + I)(c + I)),
\end{aligned}
\]

for all $a, b, c \in R$.




Observe that since multiplication in $R/I$ is commutative, we only needed to verify one
distributive law. On the other hand, if we were dealing with noncommutative rings, we would
need to verify the other distributive law for $R/I$ to be a ring.


Finally,
\[ (a + I)(1 + I) = a \cdot 1 + I = a + I = 1 \cdot a + I = (1 + I)(a + I), \]

so $1 + I$ is the identity element of $R/I$ under multiplication. Thus, we have verified that by
virtue of inheriting properties from $R$, $R/I$ does indeed satisfy all the axioms of a ring.


When $G$ was a group with normal subgroup $H$, we called groups of the form $G/H$ quotient
groups. Similarly, if $R$ is a ring with ideal $I$, we call rings of the form $R/I$ quotient rings.


It turns out that we are already quite familiar with rings of the form $R/I$ when $R = \mathbb{Z}$. To see
this, if $n > 1$ and $I = (n)$, then the cosets $a + I$ and $b + I$ are equal in $R/I = \mathbb{Z}/(n)$ if and
only if $a - b$ is a multiple of $n$. But that is precisely the condition used to define $\mathbb{Z}_n$. As a result,
the cosets in $\mathbb{Z}/(n)$ are the same as the equivalence classes in $\mathbb{Z}_n$. Therefore, rings of the form
$\mathbb{Z}_n$ are merely the special case of quotient rings $R/I$, where $R = \mathbb{Z}$ and $I = (n)$.


Recall that in $\mathbb{Z}_n$, every equivalence class has an infinite number of names. However, the
division algorithm told us that each equivalence class contains exactly one element from the
set $\{0, 1, \ldots, n - 1\}$. Thus, as a convenience, we tend to express elements of $\mathbb{Z}_n$ as

\[ [0]_n, [1]_n, \ldots, [n - 1]_n. \]
We will now carry this type of notation to a somewhat similar situation.


In a previous example, we saw that if $R = F[x]$, where $F$ is a field, and if $I$ is an ideal of $R$,
then $I = (f(x))$, for some $f(x) \in F[x]$. As a result, when we form the quotient ring
$R/I = F[x]/(f(x))$, the cosets $a(x) + I$ and $b(x) + I$ are equal if and only if $a(x) - b(x)$ is a
multiple of $f(x)$. Since $F[x]$ has a division algorithm similar to the one for $\mathbb{Z}$, there will be
some parallels between the cosets in $F[x]/(f(x))$ and those in $\mathbb{Z}/(n)$.


Let us consider the case where $f(x) \in F[x]$ has degree $n \geq 1$, and examine the coset
\[ a(x) + I, \]

where $a(x) \in F[x]$ and $I = (f(x))$. The division algorithm tells us that there exist $q(x), r(x)
\in F[x]$ such that

\[ a(x) = q(x) \cdot f(x) + r(x), \]
where $r(x) = 0$ or $\deg(r(x)) < n$. Since $a(x) - r(x) = q(x) \cdot f(x) \in (f(x))$, it follows that

\[ a(x) + I = r(x) + I. \]

Furthermore, we can write $r(x)$ as $\alpha_0 + \alpha_1 x + \cdots + \alpha_{n-1} x^{n-1}$, where each $\alpha_i \in F$.
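
The passage from $a(x)$ to its remainder $r(x)$ is ordinary polynomial long division. Here is a minimal sketch over $F = \mathbb{Q}$, assuming polynomials are stored as coefficient lists running from the constant term upward; the function name `poly_remainder` is a hypothetical choice for this illustration.

```python
from fractions import Fraction

def poly_remainder(a, f):
    """Remainder of a(x) on division by f(x) over Q.

    Coefficient lists run from the constant term up, so [-2, 0, 1] is x^2 - 2.
    Assumes f has degree n >= 1 with nonzero leading coefficient f[-1].
    The result always has exactly n entries: the coefficients of r(x).
    """
    r = [Fraction(c) for c in a]
    f = [Fraction(c) for c in f]
    n = len(f) - 1
    while len(r) - 1 >= n:
        factor = r[-1] / f[-1]          # cancels the current leading term of r
        shift = len(r) - 1 - n
        for i, c in enumerate(f):
            r[shift + i] -= factor * c
        r.pop()                         # the leading coefficient is now zero
    r += [Fraction(0)] * (n - len(r))   # pad short inputs to length n
    return r

# x^3 + x + 1 lies in the same coset of (x^2 - 2) as 1 + 3x, since x^3 = 2x there.
print(poly_remainder([1, 1, 0, 1], [-2, 0, 1]))
```

Two polynomials name the same coset of $(f(x))$ exactly when this function returns the same list for both.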


In light of the preceding, just as every equivalence class in $\mathbb{Z}_n$ contains exactly one element
from the set $\{0, 1, \ldots, n - 1\}$, we can see that every coset in $F[x]/(f(x))$ contains exactly one
element of the form

\[ \alpha_0 + \alpha_1 x + \cdots + \alpha_{n-1} x^{n-1}, \]

where each $\alpha_i \in F$. As was the case for $\mathbb{Z}_n$, there are an infinite number of names for each
equivalence class in $F[x]/(f(x))$. However, as a convenience, we can express each coset in
$F[x]/(f(x))$ as
\[ [\alpha_0 + \alpha_1 x + \cdots + \alpha_{n-1} x^{n-1}]_{f(x)}, \]

where each $\alpha_i \in F$.


Let us now look at the set

\[ [F]_{f(x)} = \{[\alpha]_{f(x)} \mid \alpha \in F\} \subseteq F[x]/(f(x)). \]

Observe that the set $[F]_{f(x)}$ is essentially a copy of the field $F$, only the elements of $F$ look
slightly different in this context. The advantage of introducing the set $[F]_{f(x)}$ is that we can
now view $F$ as a subset of $F[x]/(f(x))$. By expressing elements $\alpha \in F$ as $[\alpha]_{f(x)}$, we can now
consider $F[x]/(f(x))$ to be a commutative ring which contains the field $F$. We will now
examine some examples of rings of the form $F[x]/(f(x))$.


Examples


1. Let $R = \mathbb{Q}[x]$ and $I = (x^2 - 2)$; then every element in $R/I = \mathbb{Q}[x]/(x^2 - 2)$ can be
written in the form $[a + bx]_{x^2-2}$, where $a, b \in \mathbb{Q}$.


Addition in $\mathbb{Q}[x]/(x^2 - 2)$ is quite straightforward, and we have
\[ [a + bx]_{x^2-2} + [c + dx]_{x^2-2} = [(a + c) + (b + d)x]_{x^2-2}, \]

where $a, b, c, d \in \mathbb{Q}$.


In $\mathbb{Q}[x]/(x^2 - 2)$, $x^2$ and 2 belong to the same coset as their difference is $x^2 - 2$. Thus,
in $\mathbb{Q}[x]/(x^2 - 2)$, any multiple of $x^2$ can be replaced by the corresponding multiple of
2. As a result, we have

\[
\begin{aligned}
[a + bx]_{x^2-2} \cdot [c + dx]_{x^2-2} &= [(a + bx)(c + dx)]_{x^2-2} = [ac + (ad + bc)x + (bd)x^2]_{x^2-2} \\
&= [ac + (ad + bc)x + (bd)(2)]_{x^2-2} = [(ac + 2bd) + (ad + bc)x]_{x^2-2}.
\end{aligned}
\]


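Summarizing this, if $a, b, c, d \in \mathbb{Q}$, then
\[ [a + bx]_{x^2-2} \cdot [c + dx]_{x^2-2} = [(ac + 2bd) + (ad + bc)x]_{x^2-2}. \]
For example, in $\mathbb{Q}[x]/(x^2 - 2)$, we have
\[
\begin{aligned}
[2 - 3x]_{x^2-2} \cdot [4 + x]_{x^2-2} &= [8 - 10x - 3x^2]_{x^2-2} = [2 - 10x]_{x^2-2}, \\
[1 + 2x]_{x^2-2} \cdot [3 - 5x]_{x^2-2} &= [3 + x - 10x^2]_{x^2-2} = [-17 + x]_{x^2-2}, \\
[4 + x]_{x^2-2} \cdot \left[\tfrac{2}{7} - \tfrac{1}{14}x\right]_{x^2-2} &= \left[\tfrac{8}{7} - \tfrac{1}{14}x^2\right]_{x^2-2} = [1]_{x^2-2}.
\end{aligned}
\]

Observe that the last example indicates that

\[ \left[\tfrac{2}{7} - \tfrac{1}{14}x\right]_{x^2-2} \]

is the multiplicative inverse of $[4 + x]_{x^2-2}$. In fact, if
\[ [a + bx]_{x^2-2} \neq 0, \]

then

\[ \left[\frac{a}{a^2 - 2b^2} - \frac{b}{a^2 - 2b^2}x\right]_{x^2-2} \]

is the multiplicative inverse of $[a + bx]_{x^2-2}$. To check this, you first need to multiply
$[a + bx]_{x^2-2}$ and $\left[\frac{a}{a^2 - 2b^2} - \frac{b}{a^2 - 2b^2}x\right]_{x^2-2}$ and see that you obtain $[1]_{x^2-2}$ as the
answer. But you also need to check that if

\[ [a + bx]_{x^2-2} \neq 0, \]

then the fact that $\sqrt{2}$ is not rational implies that $a^2 - 2b^2 \neq 0$.


As a result, $\mathbb{Q}[x]/(x^2 - 2)$ is not only a commutative ring but it is also a field.
Therefore $\mathbb{Q}[x]/(x^2 - 2)$ is a field extension of $\mathbb{Q}$ of degree 2, as the set
$\{[1]_{x^2-2}, [x]_{x^2-2}\}$ is a basis for $\mathbb{Q}[x]/(x^2 - 2)$ over $\mathbb{Q}$.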

Next, consider the polynomial

\[ p(T) = T^2 - 2 \in \mathbb{Q}[T]. \]

Clearly, the field $\mathbb{Q}$ does not contain any roots of $T^2 - 2$. But what about the field
$\mathbb{Q}[x]/(x^2 - 2)$? Recall, in this context, we write elements of $\mathbb{Q}$ in the form $[a]_{x^2-2}$,
where $a \in \mathbb{Q}$. As a result, in this context, we should rewrite $p(T)$ as $T^2 - [2]_{x^2-2}$.
When we plug $[x]_{x^2-2}$ into $p(T)$, we obtain
\[ p([x]_{x^2-2}) = [x]_{x^2-2}^2 - [2]_{x^2-2} = [x^2]_{x^2-2} - [2]_{x^2-2} = [x^2 - 2]_{x^2-2} = [0]_{x^2-2}. \]

Thus, not only is $\mathbb{Q}[x]/(x^2 - 2)$ a field extension of $\mathbb{Q}$, but it also contains a root of
$p(T) = T^2 - 2 \in \mathbb{Q}[T]$.


2. Let $R = \mathbb{R}[x]$ and $I = (x^2 + 1)$; then we can write every element of $R/I =
\mathbb{R}[x]/(x^2 + 1)$ in the form $[a + bx]_{x^2+1}$, where $a, b \in \mathbb{R}$.


It is easy to see that when we add elements in $\mathbb{R}[x]/(x^2 + 1)$ we have
\[ [a + bx]_{x^2+1} + [c + dx]_{x^2+1} = [(a + c) + (b + d)x]_{x^2+1}, \]

where $a, b, c, d \in \mathbb{R}$.


In $\mathbb{R}[x]/(x^2 + 1)$, $x^2$ and $-1$ belong to the same coset as their difference is $x^2 + 1$.
Thus, in $\mathbb{R}[x]/(x^2 + 1)$, we have

\[
\begin{aligned}
[a + bx]_{x^2+1} \cdot [c + dx]_{x^2+1} &= [(a + bx)(c + dx)]_{x^2+1} = [ac + (ad + bc)x + (bd)x^2]_{x^2+1} \\
&= [ac + (ad + bc)x + (bd)(-1)]_{x^2+1} = [(ac - bd) + (ad + bc)x]_{x^2+1}.
\end{aligned}
\]
Summarizing, if $a, b, c, d \in \mathbb{R}$, we have
\[ [a + bx]_{x^2+1} \cdot [c + dx]_{x^2+1} = [(ac - bd) + (ad + bc)x]_{x^2+1}. \]
For example, in $\mathbb{R}[x]/(x^2 + 1)$, we have
\[
\begin{aligned}
[3 + 4x]_{x^2+1} \cdot [2 - x]_{x^2+1} &= [6 + 5x - 4x^2]_{x^2+1} = [10 + 5x]_{x^2+1}, \\
[1 + 2x]_{x^2+1} \cdot [5 - 7x]_{x^2+1} &= [5 + 3x - 14x^2]_{x^2+1} = [19 + 3x]_{x^2+1}, \\
[2 + 7x]_{x^2+1} \cdot \left[\tfrac{2}{53} - \tfrac{7}{53}x\right]_{x^2+1} &= \left[\tfrac{4}{53} - \tfrac{49}{53}x^2\right]_{x^2+1} = [1]_{x^2+1}.
\end{aligned}
\]

The last example illustrates that

\[ \left[\tfrac{2}{53} - \tfrac{7}{53}x\right]_{x^2+1} \]

is the multiplicative inverse of $[2 + 7x]_{x^2+1}$. More generally, if $[a + bx]_{x^2+1} \neq 0$, then at least
one of $a$ or $b$ is nonzero. As a result, $a^2 + b^2 \neq 0$, and it then is easy to check that

\[ \left[\frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}x\right]_{x^2+1} \]

is the multiplicative inverse of $[a + bx]_{x^2+1}$. Therefore, $\mathbb{R}[x]/(x^2 + 1)$ is a field
extension of $\mathbb{R}$ of degree 2 and the set $\{[1]_{x^2+1}, [x]_{x^2+1}\}$ is a basis for $\mathbb{R}[x]/(x^2 + 1)$
over $\mathbb{R}$.


If we look at the polynomial
\[ q(T) = T^2 + 1 \in \mathbb{R}[T], \]

then we know that $\mathbb{R}$ does not contain any roots of $T^2 + 1$. However, we will now
consider the field $\mathbb{R}[x]/(x^2 + 1)$. In this context, we write elements of $\mathbb{R}$ in the form
$[a]_{x^2+1}$, where $a \in \mathbb{R}$. We now rewrite $q(T)$ as $T^2 + [1]_{x^2+1}$, and when we plug $[x]_{x^2+1}$
into $q(T)$, we now obtain

\[ q([x]_{x^2+1}) = [x]_{x^2+1}^2 + [1]_{x^2+1} = [x^2]_{x^2+1} + [1]_{x^2+1} = [x^2 + 1]_{x^2+1} = [0]_{x^2+1}. \]

Thus, $\mathbb{R}[x]/(x^2 + 1)$ is a field extension of $\mathbb{R}$ that also contains a root of $q(T) = T^2 +
1 \in \mathbb{R}[T]$.
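
Both examples follow one pattern: in $F[x]/(x^2 - c)$ every $x^2$ is replaced by $c$. The sketch below encodes $[a + bx]$ as the pair `(a, b)` and works with rational coefficients only, so for $x^2 + 1$ it covers just the rational part of $\mathbb{R}[x]/(x^2 + 1)$. The names `mul` and `inverse` are assumptions made for this illustration.

```python
from fractions import Fraction

def mul(u, v, c):
    """Product of [a + bx] and [e + dx] in Q[x]/(x^2 - c), using x^2 = c."""
    a, b = u
    e, d = v
    return (a * e + c * b * d, a * d + b * e)

def inverse(u, c):
    """Inverse of [a + bx], namely [(a - bx) / (a^2 - c*b^2)], when that denominator is nonzero."""
    a, b = u
    norm = a * a - c * b * b
    if norm == 0:
        raise ZeroDivisionError("a^2 - c*b^2 = 0, so this formula gives no inverse")
    return (Fraction(a) / norm, Fraction(-b) / norm)

# x^2 = 2, as in Q[x]/(x^2 - 2)
print(mul((1, 2), (3, -5), 2))     # (-17, 1), i.e. [-17 + x]
print(inverse((4, 1), 2))          # (Fraction(2, 7), Fraction(-1, 14))

# x^2 = -1, as in the rational part of R[x]/(x^2 + 1)
print(mul((3, 4), (2, -1), -1))    # (10, 5), i.e. [10 + 5x]
print(inverse((2, 7), -1))         # (Fraction(2, 53), Fraction(-7, 53))
```

With $c = -1$ the rule in `mul` is exactly multiplication of complex numbers $a + bi$, which is why $[x]_{x^2+1}$ behaves like a square root of $-1$.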


In the preceding two examples, F[x]/(f(x)) turned out to be a field that was a finite extension 
of F. The polynomials p(7) and q(T) in these examples were really the original polynomial 
f(x), with the variable x replaced by the new variable T. In both cases, f(T) did not have a 
root in F, but [x] x) was a root of f(T) in F[x]/(f(x)). However, before going any further in 
this direction, we need to observe that not all rings of the form F[x]/(f(x)) are fields. To see 
this, we have 


Example

Let $R = \mathbb{Q}[x]$ and $I = (x^2-9)$. Then every element in $R/I = \mathbb{Q}[x]/(x^2-9)$ can be written in the form $[a+bx]_{x^2-9}$, where $a, b \in \mathbb{Q}$. Since $x^2$ and 9 belong to the same coset, if $a, b, c, d \in \mathbb{Q}$, we have

\[ [a+bx]_{x^2-9} \cdot [c+dx]_{x^2-9} = [ac + (ad+bc)x + (bd)x^2]_{x^2-9} = [(ac+9bd) + (ad+bc)x]_{x^2-9}. \]

In particular,

\[ [3+x]_{x^2-9} \cdot [-3+x]_{x^2-9} = [-9+x^2]_{x^2-9} = [0]_{x^2-9}. \]

As a result, $[3+x]_{x^2-9}$ and $[-3+x]_{x^2-9}$ are nonzero elements of $\mathbb{Q}[x]/(x^2-9)$ whose product is $[0]_{x^2-9}$. Thus, $[3+x]_{x^2-9}$ and $[-3+x]_{x^2-9}$ are zero divisors, so $\mathbb{Q}[x]/(x^2-9)$ is not a field.
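As a quick check of the product formula, here is a small sketch (our own illustration, with hypothetical names) that stores the coset $[a+bx]_{x^2-9}$ as the pair `(a, b)` and multiplies using the rule $x^2 = 9$.

```python
from fractions import Fraction

def mul_mod_x2_minus_9(u, v):
    """Multiply [a+bx] and [c+dx] in Q[x]/(x^2-9), using x^2 = 9."""
    a, b = u
    c, d = v
    return (a * c + 9 * b * d, a * d + b * c)

p = (Fraction(3), Fraction(1))     # the coset [3 + x]
q = (Fraction(-3), Fraction(1))    # the coset [-3 + x]
print(mul_mod_x2_minus_9(p, q))    # (Fraction(0, 1), Fraction(0, 1))
```

Both factors are nonzero pairs, yet the product is the zero pair, which is exactly the zero-divisor behavior described above.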




In our previous three examples, $\mathbb{Q}[x]/(x^2-2)$ and $\mathbb{R}[x]/(x^2+1)$ are fields, whereas $\mathbb{Q}[x]/(x^2-9)$ is not a field. If we wish to determine precisely when rings of the form $F[x]/(f(x))$ are fields, we will again exploit the parallels between $F[x]$ and $\mathbb{Z}$. Recall that in Chapter 7 we used the Euclidean Algorithm to show that $\mathbb{Z}_n$ was a field if and only if $n$ was prime. Observe that there is also a Euclidean Algorithm in $F[x]$, and irreducible polynomials play the role in $F[x]$ that primes play in $\mathbb{Z}$. This observation, along with a look back at our three examples of rings of the form $F[x]/(f(x))$, is evidence that $F[x]/(f(x))$ should be a field precisely when $f(x)$ is irreducible in $F[x]$. We record this as


Theorem 17.13. Let $f(x) \in F[x]$ have degree at least one. Then the ring $F[x]/(f(x))$ is a field if and only if $f(x)$ is irreducible in $F[x]$.


Proof. In one direction, suppose $f(x)$ is not irreducible in $F[x]$. Then we can write $f(x) = a(x) \cdot b(x)$, where both $a(x)$ and $b(x)$ are elements of $F[x]$ having smaller degree than $f(x)$. As a result, neither $a(x)$ nor $b(x)$ can be multiples of $f(x)$, so

\[ [a(x)]_{f(x)} \quad \text{and} \quad [b(x)]_{f(x)} \]

are both nonzero elements of $F[x]/(f(x))$.

However,

\[ [a(x)]_{f(x)} \cdot [b(x)]_{f(x)} = [a(x) \cdot b(x)]_{f(x)} = [f(x)]_{f(x)} = [0]_{f(x)}. \]

As a result, $[a(x)]_{f(x)}$ and $[b(x)]_{f(x)}$ are zero divisors in $F[x]/(f(x))$, so $F[x]/(f(x))$ is not a field.

In the other direction, suppose $f(x)$ is irreducible in $F[x]$. We need to show that if $a(x) \in F[x]$ is such that $[a(x)]_{f(x)} \neq 0$ in $F[x]/(f(x))$, then $[a(x)]_{f(x)}$ is invertible in $F[x]/(f(x))$. Since $f(x)$ is irreducible in $F[x]$ and $a(x)$ is not a multiple of $f(x)$, it follows that $a(x)$ and $f(x)$ are relatively prime in $F[x]$. The Euclidean Algorithm in $F[x]$ asserts that there exist $r(x), s(x) \in F[x]$ such that

\[ r(x) \cdot a(x) + s(x) \cdot f(x) = 1. \]

Therefore, when we multiply in $F[x]/(f(x))$, we have

\[ [r(x)]_{f(x)} \cdot [a(x)]_{f(x)} = [r(x) \cdot a(x)]_{f(x)} = [1 - s(x) \cdot f(x)]_{f(x)} = [1]_{f(x)}. \]

Thus, $[r(x)]_{f(x)}$ is the multiplicative inverse of $[a(x)]_{f(x)}$ in $F[x]/(f(x))$ and it is indeed the case that every nonzero element of $F[x]/(f(x))$ has a multiplicative inverse. Hence, $F[x]/(f(x))$ is a field.
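The second half of the proof is constructive, and it may help to see it run. The sketch below is our own illustration under stated assumptions: $F = \mathbb{Q}$, polynomials are lists of coefficients with the lowest degree first, and all function names are hypothetical. It carries out the Euclidean Algorithm while tracking a polynomial $r(x)$ with $r(x) \cdot a(x)$ congruent to the current remainder modulo $f(x)$.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, d):
    """Division algorithm in Q[x]: (quotient, remainder), deg remainder < deg d."""
    p = trim([Fraction(c) for c in p])
    d = trim([Fraction(c) for c in d])
    if not d:
        raise ZeroDivisionError("division by the zero polynomial")
    quot = [Fraction(0)] * max(len(p) - len(d) + 1, 0)
    rem = p
    while len(rem) >= len(d):
        shift = len(rem) - len(d)
        coeff = rem[-1] / d[-1]
        quot[shift] = coeff
        # Subtract coeff * x^shift * d(x) to cancel the leading term.
        rem = add(rem, mul([Fraction(0)] * shift + [-coeff], d))
    return trim(quot), rem

def inverse_mod(a, f):
    """Return r(x) with r(x)*a(x) = 1 in Q[x]/(f(x)), if gcd(a, f) = 1."""
    # Invariant: r0*a is congruent to g0 and r1*a to g1, modulo f.
    g0, r0 = trim([Fraction(c) for c in f]), []
    g1, r1 = divmod_poly(a, f)[1], [Fraction(1)]
    while g1:
        q, rem = divmod_poly(g0, g1)
        g0, g1 = g1, rem
        r0, r1 = r1, add(r0, mul([-c for c in q], r1))
    if len(g0) != 1:
        raise ValueError("a(x) and f(x) are not relatively prime")
    return divmod_poly([c / g0[0] for c in r0], f)[1]

print(inverse_mod([2, 7], [1, 0, 1]))   # [Fraction(2, 53), Fraction(-7, 53)]
```

For a reducible modulus such as $x^2 - 9$, calling `inverse_mod([3, 1], [-9, 0, 1])` raises `ValueError`, reflecting the fact that $[3+x]_{x^2-9}$ is a zero divisor rather than an invertible element.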


Based on the discussion that preceded Theorem 17.13, we can now prove the main result of 
this section. 




Theorem 17.14—Kronecker’s Theorem. If $F$ is a field and $f(T) \in F[T]$ has degree $n \geq 1$, then there exists a field $K$ containing $F$ such that $K$ contains a root of $f(T)$ and $K$ has degree at most $n$ over $F$.


Proof. Since $f(T)$ has degree at least one, $f(T)$ can be written as a product of irreducible polynomials in $F[T]$. If we let $p(T) \in F[T]$ be an irreducible factor of $f(T)$ and if we let $m$ denote the degree of $p(T)$, then clearly $m \leq n$ and any root of $p(T)$ is also a root of $f(T)$. Therefore, it suffices to find a field $K$ containing $F$ such that $K$ contains a root of $p(T)$ and has degree at most $m$ over $F$.


If we let $p(x)$ be the same polynomial as $p(T)$, except with the variable $x$, Theorem 17.13 tells us that the ring $F[x]/(p(x))$ is a field. We also know that every element of $F[x]/(p(x))$ can be written as

\[ [\alpha_0 + \alpha_1 x + \cdots + \alpha_{m-1} x^{m-1}]_{p(x)}, \]

where each $\alpha_i \in F$. Since elements of $F$ are represented in $F[x]/(p(x))$ in the form $[\alpha]_{p(x)}$, where $\alpha \in F$, we now have

\[ [\alpha_0 + \alpha_1 x + \cdots + \alpha_{m-1} x^{m-1}]_{p(x)} = [\alpha_0]_{p(x)} \cdot [1]_{p(x)} + [\alpha_1]_{p(x)} \cdot [x]_{p(x)} + \cdots + [\alpha_{m-1}]_{p(x)} \cdot [x^{m-1}]_{p(x)}. \]

This last equation tells us that the set

\[ \{[1]_{p(x)}, [x]_{p(x)}, \ldots, [x^{m-1}]_{p(x)}\} \]

spans $F[x]/(p(x))$ over $F$. Thus, the degree of $F[x]/(p(x))$ over $F$ is at most $m$. Although it is not needed for this proof, it is not hard to see that the set $\{[1]_{p(x)}, [x]_{p(x)}, \ldots, [x^{m-1}]_{p(x)}\}$ is also linearly independent over $F$. Thus, the degree of $F[x]/(p(x))$ over $F$ is exactly $m$.

In light of the preceding, it now suffices to show that $F[x]/(p(x))$ contains a root of $p(T)$. Based on our examples, $[x]_{p(x)}$ is the obvious candidate. Using our representation of elements of $F$ as cosets in $F[x]/(p(x))$, we know that $p(T)$ can be written as

\[ p(T) = [\beta_0]_{p(x)} + \cdots + [\beta_{m-1}]_{p(x)} \cdot T^{m-1} + [\beta_m]_{p(x)} \cdot T^m, \]

where each $\beta_i \in F$ and $\beta_m \neq 0$. Using the fact that $[p(x)]_{p(x)} = [0]_{p(x)}$ in $F[x]/(p(x))$, when we plug our candidate $[x]_{p(x)}$ into $p(T)$, we obtain

\[ p([x]_{p(x)}) = [\beta_0]_{p(x)} + \cdots + [\beta_{m-1}]_{p(x)} \cdot ([x]_{p(x)})^{m-1} + [\beta_m]_{p(x)} \cdot ([x]_{p(x)})^m = [\beta_0 + \cdots + \beta_{m-1} x^{m-1} + \beta_m x^m]_{p(x)} = [p(x)]_{p(x)} = [0]_{p(x)}. \]

Thus, $[x]_{p(x)}$ is indeed a root of $p(T)$, as required.
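To see the last computation in a concrete case, the following sketch evaluates $p(T)$ at a coset of $\mathbb{Q}[x]/(p(x))$ using Horner's rule. It is only an illustration under our own assumptions: $p(x)$ is monic, cosets are stored as lists of $m$ coefficients (lowest degree first), and the sample polynomial $x^3 - 2$ and all names are ours rather than the text's.

```python
from fractions import Fraction

def reduce_mod(poly, p):
    """Remainder of poly on division by the monic polynomial p."""
    r = [Fraction(c) for c in poly]
    m = len(p) - 1
    while len(r) > m:
        lead = r.pop()                 # coefficient of the current top power
        shift = len(r) - m
        for i in range(m):             # replace x^m by -(p_0 + ... + p_{m-1} x^{m-1})
            r[shift + i] -= lead * p[i]
    return r + [Fraction(0)] * (m - len(r))

def mul_mod(u, v, p):
    prod = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] += a * b
    return reduce_mod(prod, p)

def evaluate_at_coset(p, t):
    """Evaluate p(T) at the coset t of Q[x]/(p(x)) by Horner's rule."""
    value = [Fraction(0)] * (len(p) - 1)
    for beta in reversed(p):
        value = mul_mod(value, t, p)
        value[0] += beta               # add the coset [beta]
    return value

p = [-2, 0, 0, 1]                      # p(x) = x^3 - 2
x_coset = [0, 1, 0]                    # the coset [x]
print(evaluate_at_coset(p, x_coset))   # [Fraction(0, 1), Fraction(0, 1), Fraction(0, 1)]
```

The result is the zero coset, so $[x]_{p(x)}$ is a root of $p(T)$. Evaluating at the coset `[1, 0, 0]` instead gives $[-1]_{p(x)}$, which is not zero.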




When we compare the fields $\mathbb{Q}[x]/(x^2-2)$ and $\mathbb{Q}(\sqrt{2})$, they appear to be essentially the same field. Elements of both fields look like polynomials of degree at most one with rational coefficients. The field $\mathbb{Q}[x]/(x^2-2)$ contains the element $[x]_{x^2-2}$, which has the interesting property that its square is equal to 2, whereas the element $\sqrt{2} \in \mathbb{Q}(\sqrt{2})$ plays an analogous role in $\mathbb{Q}(\sqrt{2})$. Similarly, the fields $\mathbb{R}[x]/(x^2+1)$ and $\mathbb{C}$ seem to be essentially the same field. The element $[x]_{x^2+1}$ plays the same role in $\mathbb{R}[x]/(x^2+1)$ that the element $i$ plays in $\mathbb{C}$.


In Chapter 8, when two groups were essentially the same group, except that the elements 
might have different names, we said that the groups were isomorphic. More formally, we 
defined two groups to be isomorphic when there existed a homomorphism between them that 
was also a bijection. In Chapter 9, we introduced homomorphisms of rings and used this 
concept to prove various irreducibility criteria. In an attempt to formalize the notion of two 
different rings or fields being essentially the same algebraic object, we have 


Definition 17.15. Rings $R_1$ and $R_2$ are said to be isomorphic if there exists a ring homomorphism $\phi : R_1 \to R_2$ which is also a bijection.


It is natural to wonder if there are straightforward ways to show that $\mathbb{Q}[x]/(x^2-2)$ and $\mathbb{Q}(\sqrt{2})$ are isomorphic and also that $\mathbb{R}[x]/(x^2+1)$ and $\mathbb{C}$ are isomorphic. The tool that we will use will be the analog for rings of Theorem 8.39(a)—Isomorphism Theorem for Groups.


Theorem 17.16—Isomorphism Theorem for Rings. Let $\phi : R_1 \to R_2$ be a homomorphism of rings.
(a) $\mathrm{Ker}(\phi) = \{r \in R_1 \mid \phi(r) = 0\}$ is an ideal of $R_1$.
(b) $\mathrm{Im}(\phi) = \{\phi(r) \mid r \in R_1\}$ is a ring.
(c) The rings $R_1/\mathrm{Ker}(\phi)$ and $\mathrm{Im}(\phi)$ are isomorphic.


Proof. For part (a), since $\phi$ is a homomorphism of groups, Theorem 8.24(c) tells us that $\mathrm{Ker}(\phi)$ is a subgroup under addition of $R_1$. In this situation, 0 represents the additive identity of $R_2$ and we know that multiplication by 0 in $R_2$ always results in 0. To check that $\mathrm{Ker}(\phi)$ is an ideal of $R_1$, if we let $a \in \mathrm{Ker}(\phi)$ and $r \in R_1$, we have

\[ \phi(ar) = \phi(a)\phi(r) = 0 \cdot \phi(r) = 0 \]

and

\[ \phi(ra) = \phi(r)\phi(a) = \phi(r) \cdot 0 = 0. \]

Therefore, $ar, ra \in \mathrm{Ker}(\phi)$, hence $\mathrm{Ker}(\phi)$ is an ideal of $R_1$.

For part (b), since $\phi$ is a homomorphism of groups, we again cite Theorem 8.24(c), this time to assert that $\mathrm{Im}(\phi)$ is a subgroup of $R_2$ under addition. Furthermore, since $\mathrm{Im}(\phi)$ is a subset of the ring $R_2$, we know that the associative and commutative laws of multiplication and the distributive laws hold when dealing with elements of $\mathrm{Im}(\phi)$. Therefore, in order to verify that $\mathrm{Im}(\phi)$ is a ring, it suffices to show that $\mathrm{Im}(\phi)$ is closed under multiplication and also contains a multiplicative identity.

Observe that if $r, s \in \mathrm{Im}(\phi)$, then there exist $a, b \in R_1$ such that $r = \phi(a)$ and $s = \phi(b)$. Therefore,

\[ rs = \phi(a)\phi(b) = \phi(ab) \in \mathrm{Im}(\phi), \]

so $\mathrm{Im}(\phi)$ is closed under multiplication.

If we let 1 denote the multiplicative identity of $R_1$, then $\phi(1)$ is the likely candidate to be the multiplicative identity of $\mathrm{Im}(\phi)$. To check that it is, if $r \in \mathrm{Im}(\phi)$, then $r = \phi(a)$, for some $a \in R_1$, and we have

\[ r\phi(1) = \phi(a)\phi(1) = \phi(a \cdot 1) = \phi(a) = r \]

as well as

\[ \phi(1)r = \phi(1)\phi(a) = \phi(1 \cdot a) = \phi(a) = r. \]

Thus, $r\phi(1) = r = \phi(1)r$. Therefore, $\phi(1)$ is the multiplicative identity of $\mathrm{Im}(\phi)$ and $\mathrm{Im}(\phi)$ is a ring.

Since $R_2$ is commutative, it was not necessary to show that both $r\phi(1)$ and $\phi(1)r$ were equal to $r$. However, by providing the extra details, we are actually providing a proof that also works for noncommutative rings.

For part (c), most of the proof is already taken care of by the proof of Theorem 8.39(a)—Isomorphism Theorem for Groups. In particular, that proof tells us that if we define

\[ \nu : R_1/\mathrm{Ker}(\phi) \to \mathrm{Im}(\phi) \]

as

\[ \nu(a + \mathrm{Ker}(\phi)) = \phi(a), \]

for all $a \in R_1$, then $\nu$ is an isomorphism between the groups $R_1/\mathrm{Ker}(\phi)$ and $\mathrm{Im}(\phi)$.

As a result, it now suffices to show that $\nu$ preserves multiplication. To this end, if we are given two elements of $R_1/\mathrm{Ker}(\phi)$, then we can represent them as $a + \mathrm{Ker}(\phi)$ and $b + \mathrm{Ker}(\phi)$, where $a, b \in R_1$. Since $\phi$ is a homomorphism of rings, we know that it preserves multiplication, so

\[ \nu((a + \mathrm{Ker}(\phi)) \cdot (b + \mathrm{Ker}(\phi))) = \nu(ab + \mathrm{Ker}(\phi)) = \phi(ab) = \phi(a)\phi(b) = \nu(a + \mathrm{Ker}(\phi)) \cdot \nu(b + \mathrm{Ker}(\phi)), \]

as required.




We will conclude this section by using the Isomorphism Theorem for Rings to further examine the quotient rings $\mathbb{Q}[x]/(x^2-2)$, $\mathbb{R}[x]/(x^2+1)$, and $\mathbb{Q}[x]/(x^2-9)$.


Examples

1. Let $\phi : \mathbb{Q}[x] \to \mathbb{Q}(\sqrt{2})$ be defined as $\phi(f(x)) = f(\sqrt{2})$, for all $f(x) \in \mathbb{Q}[x]$. Observe that if $f(x), g(x) \in \mathbb{Q}[x]$, we have

\[ \phi(f(x) + g(x)) = f(\sqrt{2}) + g(\sqrt{2}) = \phi(f(x)) + \phi(g(x)) \]

and

\[ \phi(f(x) \cdot g(x)) = f(\sqrt{2}) \cdot g(\sqrt{2}) = \phi(f(x)) \cdot \phi(g(x)). \]

As a result, $\phi$ is a homomorphism of rings. If $a, b \in \mathbb{Q}$, then

\[ \phi(a + bx) = a + b\sqrt{2}, \]

so $\phi$ is certainly surjective.

Corollary 12.7 tells us that the elements of $\mathbb{Q}[x]$ that have $\sqrt{2}$ as a root are precisely the multiples of $x^2-2$. Observe that this is equivalent to saying that $\mathrm{Ker}(\phi) = (x^2-2)$. Since we know that $\mathrm{Im}(\phi) = \mathbb{Q}(\sqrt{2})$ and $\mathrm{Ker}(\phi) = (x^2-2)$, Theorem 17.16(c) asserts that

\[ \mathbb{Q}[x]/(x^2-2) \cong \mathbb{Q}(\sqrt{2}). \]

2. Let $\phi : \mathbb{R}[x] \to \mathbb{C}$ be defined as $\phi(f(x)) = f(i)$, for all $f(x) \in \mathbb{R}[x]$. Using a short argument almost identical to the one in the previous example, we can see that $\phi$ is a homomorphism. Indeed, if $f(x), g(x) \in \mathbb{R}[x]$, we have

\[ \phi(f(x) + g(x)) = f(i) + g(i) = \phi(f(x)) + \phi(g(x)) \]

and

\[ \phi(f(x) \cdot g(x)) = f(i) \cdot g(i) = \phi(f(x)) \cdot \phi(g(x)). \]

If $a, b \in \mathbb{R}$, then

\[ \phi(a + bx) = a + bi, \]

so $\phi$ is surjective. Corollary 12.7 tells us that the elements of $\mathbb{R}[x]$ that have $i$ as a root are precisely the multiples of $x^2+1$. This immediately tells us that $\mathrm{Ker}(\phi) = (x^2+1)$, so Theorem 17.16(c) tells us that

\[ \mathbb{R}[x]/(x^2+1) \cong \mathbb{C}. \]




3. Let $\phi : \mathbb{Q}[x] \to \mathbb{Q} \times \mathbb{Q}$ be defined as $\phi(f(x)) = (f(3), f(-3))$, for all $f(x) \in \mathbb{Q}[x]$. The ring $\mathbb{Q} \times \mathbb{Q}$ consists of all ordered pairs of rational numbers where addition and multiplication are done componentwise. In particular, if $a, b, c, d \in \mathbb{Q}$, then

\[ (a,b) + (c,d) = (a+c, b+d) \quad \text{and} \quad (a,b) \cdot (c,d) = (ac, bd). \]

Observe that in $\mathbb{Q} \times \mathbb{Q}$ we have $(1,0) \cdot (0,1) = (0,0)$. Therefore, $\mathbb{Q} \times \mathbb{Q}$ has zero divisors and thus cannot be a field.

If $f(x), g(x) \in \mathbb{Q}[x]$, we have

\[ \phi(f(x) + g(x)) = (f(3) + g(3), f(-3) + g(-3)) = (f(3), f(-3)) + (g(3), g(-3)) = \phi(f(x)) + \phi(g(x)) \]

and

\[ \phi(f(x) \cdot g(x)) = (f(3) \cdot g(3), f(-3) \cdot g(-3)) = (f(3), f(-3)) \cdot (g(3), g(-3)) = \phi(f(x)) \cdot \phi(g(x)). \]

Thus, $\phi$ is a homomorphism of rings.

If $a, b \in \mathbb{Q}$ then, when we let $f(x) = \left(\frac{a-b}{6}\right)x + \frac{a+b}{2}$, it is easy to check that $f(3) = a$ and $f(-3) = b$. Thus,

\[ \phi(f(x)) = (f(3), f(-3)) = (a, b), \]

which shows that $\phi$ is surjective.

Next, if $f(x) \in \mathbb{Q}[x]$, it is not hard to see that the three conditions of having 3 and $-3$ as roots, being a multiple of $x^2-9$, and belonging to $\mathrm{Ker}(\phi)$ are all equivalent. Therefore, $\mathrm{Ker}(\phi) = (x^2-9)$, and Theorem 17.16(c) now asserts that

\[ \mathbb{Q}[x]/(x^2-9) \cong \mathbb{Q} \times \mathbb{Q}. \]

In an earlier example, we showed that $\mathbb{Q}[x]/(x^2-9)$ was not a field because the cosets containing $3+x$ and $-3+x$ are nonzero, yet their product was the coset containing 0. Since $\phi$ induces an isomorphism, the product of $\phi(3+x)$ and $\phi(-3+x)$ should now be the additive identity of $\mathbb{Q} \times \mathbb{Q}$, which is $(0,0)$. Observe that this is indeed the case as

\[ \phi(3+x) \cdot \phi(-3+x) = (6,0) \cdot (0,-6) = (0,0). \]
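The evaluation map of the third example is easy to experiment with. The sketch below is our own illustration with hypothetical names: polynomials in $\mathbb{Q}[x]$ are lists of coefficients with the lowest degree first, `phi` returns the pair $(f(3), f(-3))$, and `preimage` builds the degree-one polynomial used in the surjectivity argument.

```python
from fractions import Fraction

def evaluate(coeffs, t):
    """Horner evaluation of a polynomial (coefficients lowest degree first)."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * t + c
    return value

def phi(coeffs):
    """The map f(x) -> (f(3), f(-3)) from Q[x] to Q x Q."""
    return (evaluate(coeffs, 3), evaluate(coeffs, -3))

def preimage(a, b):
    """The polynomial ((a-b)/6)x + (a+b)/2, which phi sends to (a, b)."""
    a, b = Fraction(a), Fraction(b)
    return [(a + b) / 2, (a - b) / 6]

print(phi([3, 1]))             # (Fraction(6, 1), Fraction(0, 1))
print(phi([-3, 1]))            # (Fraction(0, 1), Fraction(-6, 1))
print(phi([-9, 0, 1]))         # (Fraction(0, 1), Fraction(0, 1))
print(phi(preimage(5, -2)))    # (Fraction(5, 1), Fraction(-2, 1))
```

The third line confirms that $x^2 - 9$ lies in the kernel, and the last line confirms the surjectivity formula for one sample pair.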




Exercises for Section 17.3 


1. Let $R$ be a commutative ring. If $r \in R$, show that the ideal $(r) = \{rs \mid s \in R\}$ is equal to all of $R$ if and only if $r$ has a multiplicative inverse in $R$.

2. Suppose $n, m$ are nonzero integers such that the ideals $(n) = \{nb \mid b \in \mathbb{Z}\}$ and $(m) = \{mb \mid b \in \mathbb{Z}\}$ are equal. What can you say about the relationship between $n$ and $m$?

3. Suppose $f(x), g(x)$ are nonzero elements of $F[x]$, where $F$ is a field, such that the ideals $(f(x)) = \{f(x) \cdot h(x) \mid h(x) \in F[x]\}$ and $(g(x)) = \{g(x) \cdot h(x) \mid h(x) \in F[x]\}$ are equal. What can you say about the relationship between $f(x)$ and $g(x)$?

4. Using the notation of exercise 2, suppose $n, m$ are nonzero integers such that $(n) \subseteq (m)$. What can you say about the relationship between $n$ and $m$?

5. Using the notation of exercise 3, suppose $f(x), g(x)$ are nonzero elements of $F[x]$, where $F$ is a field, such that $(f(x)) \subseteq (g(x))$. What can you say about the relationship between $f(x)$ and $g(x)$?

6. Let $I, J$ be ideals of a commutative ring $R$. If $I + J = \{a + b \mid a \in I, b \in J\}$, show that $I + J$ is also an ideal of $R$.

7. Let $I, J$ be ideals of a commutative ring $R$. Show that $I \cap J$ is also an ideal of $R$.

8. Using the notation of exercises 2 and 6, if $n, m$ are nonzero integers and $c = \gcd(n, m)$, show that $(n) + (m) = (c)$.

9. Let $f(x), g(x)$ be nonzero elements of $F[x]$, where $F$ is a field. Using the notation of exercises 3 and 6, if $c(x) = \gcd(f(x), g(x))$, show that $(f(x)) + (g(x)) = (c(x))$.

10. Using the notation of exercise 2, if $n, m$ are nonzero integers and $d = \mathrm{lcm}(n, m)$, show that $(n) \cap (m) = (d)$.

11. Using the notation of exercise 3, if $f(x), g(x)$ are nonzero elements of $F[x]$, where $F$ is a field, and $d(x) = \mathrm{lcm}(f(x), g(x))$, show that $(f(x)) \cap (g(x)) = (d(x))$.


Before doing exercise 12, please read the following:

In Chapter 12, we saw that a huge difference between $\mathbb{Z}[x]$ and $F[x]$, where $F$ is a field, is that there is a division algorithm in $F[x]$ but not in $\mathbb{Z}[x]$. In this chapter, we saw that every ideal of $F[x]$ is of the form $(f(x)) = \{f(x) \cdot h(x) \mid h(x) \in F[x]\}$, for some $f(x) \in F[x]$. Since the proof of this relied on the division algorithm, it may not come as much of a surprise that there exist ideals of $\mathbb{Z}[x]$ not of the form $(g(x)) = \{g(x) \cdot k(x) \mid k(x) \in \mathbb{Z}[x]\}$.

12. In $\mathbb{Z}[x]$, using the notation of exercise 6, let $I = (x) + (2)$. We can also think of $I$ as $\{f(x) \in \mathbb{Z}[x] \mid f(0) \text{ is an even integer}\}$. Show that there does not exist any $g(x) \in \mathbb{Z}[x]$ such that $I = (g(x))$.




In exercises 13-18, we will be working in the field $K = \mathbb{Q}[x]/(x^2+6x-2)$. In your answers to these exercises, please write elements of $K$ in the form $[a+bx]_{x^2+6x-2}$, where $a, b \in \mathbb{Q}$.

13. Compute $[8-5x]_{x^2+6x-2} \cdot [2+3x]_{x^2+6x-2}$.

14. Compute $[-6+7x]_{x^2+6x-2} \cdot [11-4x]_{x^2+6x-2}$.

15. Find the general formula for the product $[a+bx]_{x^2+6x-2} \cdot [c+dx]_{x^2+6x-2}$, where $a, b, c, d \in \mathbb{Q}$.

16. If $a, b \in \mathbb{Q}$ such that $[a+bx]_{x^2+6x-2}$ is not zero, find $([a+bx]_{x^2+6x-2})^{-1}$.

17. Show that both $[x]_{x^2+6x-2}$ and $[-6-x]_{x^2+6x-2}$ are roots in $K$ of $T^2+6T-2 \in \mathbb{Q}[T]$.

18. Find two elements in $K$ whose square is $[11]_{x^2+6x-2}$.


In exercises 19-24, we will be working in the field $L = \mathbb{Q}[x]/(x^3-7)$. In your answers to these exercises, please write elements of $L$ in the form $[a+bx+cx^2]_{x^3-7}$, where $a, b, c \in \mathbb{Q}$.

19. Compute $[2-5x]_{x^3-7} \cdot [6+3x^2]_{x^3-7}$.

20. Compute $[5x-7x^2]_{x^3-7} \cdot [-5+2x]_{x^3-7}$.

21. Compute $[1+4x-x^2]_{x^3-7} \cdot [8-7x+5x^2]_{x^3-7}$.

22. Find the general formula for the product $[a+bx+cx^2]_{x^3-7} \cdot [d+ex+fx^2]_{x^3-7}$, where $a, b, c, d, e, f \in \mathbb{Q}$.

23. How many roots of $T^3-7 \in \mathbb{Q}[T]$ belong to $L$?

24. Show that $[-2+x]_{x^3-7}$ is a root of $T^3+6T^2+12T+1 \in \mathbb{Q}[T]$.


In exercises 25-30, we will be working in the commutative ring $R = \mathbb{Q}[x]/(x^2-25)$. In your answers to these exercises, please write elements of $R$ in the form $[a+bx]_{x^2-25}$, where $a, b \in \mathbb{Q}$.

25. Compute $[3+2x]_{x^2-25} \cdot [-11-4x]_{x^2-25}$.

26. Find the general formula for the product $[a+bx]_{x^2-25} \cdot [c+dx]_{x^2-25}$, where $a, b, c, d \in \mathbb{Q}$.

27. Find an element $r \in R$ such that $[7-8x]_{x^2-25} \cdot r = [1]_{x^2-25}$.

28. Find a nonzero element $s \in R$ such that $[15-3x]_{x^2-25} \cdot s = [0]_{x^2-25}$.

29. Show that every nonzero element of $R$ is either invertible or a zero divisor.

30. Show that $[a+bx]_{x^2-25}$ is invertible in $R$ if and only if $a$ is equal to neither $5b$ nor $-5b$.

31. Show that $\mathbb{Z}_2[x]/(x^2+x+[1]_2)$ is a field with 4 elements.




32. Show that $\mathbb{Z}_2[x]/(x^3+x+[1]_2)$ is a field with 8 elements.

33. Show that $\mathbb{Z}_3[x]/(x^2+[1]_3)$ is a field with 9 elements.

34. Show that $\mathbb{Z}_3[x]/(x^3+[2]_3x+[1]_3)$ is a field with 27 elements.

In exercises 35-40, we examine finite fields. In exercises 35-37, you will show that if $F$ is a finite field, then $|F| = p^n$, where $p$ is prime and $n \in \mathbb{N}$. Conversely, in exercises 38-40, you will show that given a prime $p$ and $n \in \mathbb{N}$, there exists a field $F$ with $p^n$ elements.

35. If $F$ is a finite field, show that $F$ has characteristic $p > 0$.

36. If $F$ has characteristic $p > 0$, show that $F$ contains a field $E$ which is isomorphic to the field $\mathbb{Z}_p$.

37. If $F$ is a finite field, show that $F$ has $p^n$ elements, where $p$ is the characteristic and $n$ is the dimension of $F$ over the subfield isomorphic to $\mathbb{Z}_p$.

38. If $L$ is any field and $f(x) \in L[x]$ has degree $n \geq 1$, show that there exists a field $M$ that contains $L$ and also contains $n$ roots, counting multiplicities, of $f(x)$.

39. If $p$ is a prime number and $n \in \mathbb{N}$, show that there exists a field $K$ which contains $\mathbb{Z}_p$ and also contains $p^n$ different roots of the polynomial $x^{p^n} - x \in \mathbb{Z}_p[x]$. (Technically, in $\mathbb{Z}_p[x]$, this polynomial is written as $[1]_p x^{p^n} + [p-1]_p x$.)

40. Let $K$ be as in exercise 39 and let $F = \{r \in K \mid r^{p^n} = r\}$. Show that $F$ is a field with exactly $p^n$ elements.


41. Let @: Q[x] — C be the ring homomorphism defined as ¢( f(x)) = fG — 24), for all 
f(x) € Qi]. 


(a) Describe, as simply as possible, the elements of $\mathrm{Ker}(\phi)$.
(b) Describe, as simply as possible, the elements of $\mathrm{Im}(\phi)$.
(c) Based on the Isomorphism Theorem for Rings and parts (a) and (b), which two rings can now be seen to be isomorphic?


42. Let $\phi : \mathbb{C}[x] \to \mathbb{C} \oplus \mathbb{C}$ be the ring homomorphism defined as $\phi(g(x)) = (g(5), g(-2))$, for all $g(x) \in \mathbb{C}[x]$.

(a) Describe, as simply as possible, the elements of $\mathrm{Ker}(\phi)$.
(b) Describe, as simply as possible, the elements of $\mathrm{Im}(\phi)$.
(c) Based on the Isomorphism Theorem for Rings and parts (a) and (b), which two rings can now be seen to be isomorphic?

43. Let $\phi : \mathbb{Q}[x] \to \mathbb{C} \oplus \mathbb{C}$ be the ring homomorphism defined as $\phi(h(x)) = (h(\sqrt{3}), h(i))$, for all $h(x) \in \mathbb{Q}[x]$.

(a) Describe, as simply as possible, the elements of $\mathrm{Ker}(\phi)$.
(b) Describe, as simply as possible, the elements of $\mathrm{Im}(\phi)$.
(c) Based on the Isomorphism Theorem for Rings and parts (a) and (b), which two rings can now be seen to be isomorphic?

44. Let $\phi : \mathbb{Q}[x] \to \mathbb{R} \oplus \mathbb{R} \oplus \mathbb{R}$ be the ring homomorphism defined as $\phi(f(x)) = (f(\sqrt{5}), f(1), f(-9))$, for all $f(x) \in \mathbb{Q}[x]$.

(a) Describe, as simply as possible, the elements of $\mathrm{Ker}(\phi)$.
(b) Describe, as simply as possible, the elements of $\mathrm{Im}(\phi)$.
(c) Based on the Isomorphism Theorem for Rings and parts (a) and (b), which two rings can now be seen to be isomorphic?


In exercises 45-72, we will try to better understand fields of the form $\mathbb{Q}(\omega)$, where $\omega = \mathrm{cis}(2\pi/n)$, for some $n \geq 1$. Our goals will be to

(a) Find $[\mathbb{Q}(\omega) : \mathbb{Q}]$ and the minimum polynomial for $\omega$ over $\mathbb{Q}$.
(b) Determine the structure of $\mathrm{Gal}(\mathbb{Q}(\omega)/\mathbb{Q})$.

Throughout these exercises, $n$ will be a fixed positive integer and $\omega$ will denote $\mathrm{cis}(2\pi/n)$. We say that $\gamma \in \mathbb{C}$ is a primitive $n$th root of 1 if $\gamma^n = 1$ and $\gamma^d \neq 1$, for every positive integer $d$ which is less than $n$. Observe that $\omega$ is always one of the primitive $n$th roots of 1. For any $m \in \mathbb{N}$, let $W_m$ be the set of all primitive $m$th roots of 1. Next, we define the $n$th cyclotomic polynomial to be

\[ \Psi_n(x) = \prod_{\gamma \in W_n} (x - \gamma). \]

Thus, the roots of $\Psi_n(x)$ are precisely the primitive $n$th roots of 1. For example,

\[ \Psi_1(x) = x - 1, \qquad \Psi_2(x) = x - (-1) = x + 1, \]

and

\[ \Psi_3(x) = \left(x - \mathrm{cis}\left(\tfrac{2\pi}{3}\right)\right)\left(x - \mathrm{cis}\left(\tfrac{4\pi}{3}\right)\right) = x^2 + x + 1. \]

We will also let $1 = d_1 < d_2 < \cdots < d_{t-1} < d_t = n$ be all the positive divisors of $n$.

45. In Chapter 7, we introduced the Euler $\phi$-function. If $m_1, m_2, \ldots, m_{\phi(n)}$ are the $\phi(n)$ positive integers which are less than $n$ and relatively prime to $n$, show that $\omega^{m_1}, \omega^{m_2}, \ldots, \omega^{m_{\phi(n)}}$ are the primitive $n$th roots of 1.




46. If $d_i \neq d_j$ are divisors of $n$, show that $W_{d_i} \cap W_{d_j} = \emptyset$.

47. Show that the union of sets $W_{d_1} \cup W_{d_2} \cup \cdots \cup W_{d_{t-1}} \cup W_{d_t}$ is equal to the set of all the roots in $\mathbb{C}$ of $x^n - 1$.

48. Use exercises 46 and 47 to show that

\[ x^n - 1 = \Psi_{d_1}(x) \cdot \Psi_{d_2}(x) \cdots \Psi_{d_{t-1}}(x) \cdot \Psi_{d_t}(x). \]

49. Use the previous exercise and the Second Version of Mathematical Induction to prove that $\Psi_n(x) \in \mathbb{Q}[x]$, for every $n \in \mathbb{N}$.

50. Having already shown that $\Psi_n(x) \in \mathbb{Q}[x]$, now show that $\Psi_n(x) \in \mathbb{Z}[x]$. You might want to think about the ideas used toward the end of the proof of Gauss’ Lemma.

51. If $p$ is a prime number, show that $\Psi_p(x) = x^{p-1} + x^{p-2} + \cdots + x + 1$.


Before doing exercises 52-57, observe that in light of the formula in exercise 48, the easiest way to find $\Psi_n(x)$ is often to use the previously computed $\Psi_d(x)$, for divisors $d$ of $n$.
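The observation above can be turned into a short recursive computation. The following is a rough sketch under our own assumptions: it takes the factorization of exercise 48 for granted, stores integer polynomials as tuples of coefficients with the lowest degree first, and uses hypothetical function names. It divides $x^n - 1$ by the cyclotomic polynomials of the proper divisors of $n$.

```python
from functools import lru_cache

def divide_exact(num, den):
    """Divide integer polynomials; den must be monic and divide num exactly."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for k in range(len(quot) - 1, -1, -1):
        quot[k] = num[k + len(den) - 1]       # current leading coefficient
        for j, d in enumerate(den):
            num[k + j] -= quot[k] * d         # subtract quot[k] * x^k * den
    if any(num):
        raise ValueError("division was not exact")
    return tuple(quot)

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of the nth cyclotomic polynomial, lowest degree first."""
    poly = (-1,) + (0,) * (n - 1) + (1,)      # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = divide_exact(poly, cyclotomic(d))
    return poly

print(cyclotomic(1))   # (-1, 1)
print(cyclotomic(2))   # (1, 1)
print(cyclotomic(3))   # (1, 1, 1)
```

These three outputs agree with the polynomials $x - 1$, $x + 1$, and $x^2 + x + 1$ computed by hand above.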


52. 
53. 
54. 
55. 
56. 
57. 
58. 


Find W(x). 
Find W(x). 
Find W(x). 
Find W(x). 
Find W(x). 
Find W19(x). 


Use Gauss’ Lemma to show that there exist $f(x), g(x) \in \mathbb{Z}[x]$ such that $x^n - 1 = f(x) \cdot g(x)$, $f(x)$ is irreducible in $\mathbb{Q}[x]$, and $f(\omega) = 0$.


In all the remaining exercises, $f(x)$ and $g(x)$ will refer to the polynomials in exercise 58. In these exercises, $\gamma \in \mathbb{C}$ will denote a root of $f(x)$ and $p$ will be a prime number that does not divide $n$.

59. Show that if $\gamma^p$ is not a root of $f(x)$, then $\gamma^p$ is a root of $g(x)$ and $\gamma$ is a root of $g(x^p)$.

60. Assume once again that $\gamma^p$ is not a root of $f(x)$ and then show that $g(x^p)$ is a multiple of $f(x)$.

61. Let $\rho : \mathbb{Z}[x] \to \mathbb{Z}_p[x]$ be the ring homomorphism used in the proof of Gauss’ Lemma. Observe that

\[ [1]_p x^n + [p-1]_p = \rho(x^n - 1) = \rho(f(x) \cdot g(x)) = \rho(f(x)) \cdot \rho(g(x)). \]

Show that $\rho(g(x))^p = \rho(g(x^p))$, and then show that if $\gamma^p$ is not a root of $f(x)$, then $\rho(g(x))^p$ is a multiple of $\rho(f(x))$.

62. Kronecker’s Theorem tells us that there exists a field $K$ containing $\mathbb{Z}_p$ that contains a root $r$ of $\rho(f(x))$. If $\gamma^p$ is not a root of $f(x)$, show that $r$ is also a root of $\rho(g(x))$.

63. Use Proposition 12.22 to show that $[1]_p x^n + [p-1]_p$ has no multiple roots in any field $K$ containing $\mathbb{Z}_p$.

64. Use exercises 62 and 63 to show that $\gamma^p$ must be a root of $f(x)$.

65. If $d > 1$ is relatively prime to $n$, show that the prime factorization of $d$ consists of primes that do not divide $n$.

66. Use exercises 64 and 65 to show that every primitive $n$th root of 1 is a root of $f(x)$.

67. Use exercise 66 to show that the degree of $f(x)$ is at least as large as the degree of $\Psi_n(x)$.

68. Show that $\Psi_n(x)$ is irreducible in $\mathbb{Q}[x]$.

69. Show that $[\mathbb{Q}(\omega) : \mathbb{Q}] = \phi(n)$.

70. Show that $\mathbb{Q}(\omega)$ is the splitting field of $x^n - 1$ over $\mathbb{Q}$.

71. Show that $\mathrm{Gal}(\mathbb{Q}(\omega)/\mathbb{Q}) \cong U(\mathbb{Z}_n)$, where $U(\mathbb{Z}_n)$ is the group of invertible elements in $\mathbb{Z}_n$.

72. Describe the behavior of each element of $\mathrm{Gal}(\mathbb{Q}(\omega)/\mathbb{Q})$ on $\omega$.




A 


Abelian groups, 270 
cyclic groups and, 277 
factor groups as, 325, 341 
Galois group as, 606-607 
Galois groups of splitting fields 
and, 607-608 
isomorphism and, 288 
solvable groups and, 332 
Abstract algebra, 1-2 
field extensions’ complexity in, 
580 
functions in, 46 
functions in calculus compared to, 
46-47 
teachers and, 13 
Acute angles, trisecting, 21-22 
Addition. See also Associative law 
of addition; Commutative law 
of addition 
associative law of, 140 
in basic arithmetic, 125 
commutative law of, 140 
complex numbers and, 138 
of cosets, 303 
for decimal expansion, 125 
of equivalence classes of arrows, 
193 
groups, commutative rings/fields 
under, 266 
of ordered pairs, 529-530 
of rational functions, 470 
of rational numbers, 118—122 


of rational numbers as well defined, 


120-121 
of real numbers in geometry, 
190-191 
in Z,,, 229-232 
Additive identity, 140 
commutative rings and, 161-163 
complex numbers and, 144 
polynomials with, 530 


of scalars, 535 
of vectors, 535 
Additive inverses, 140 
polynomials and, 530 
Algebra, 1-2. See also Abstract 
algebra; Fundamental Theorem 
of Algebra 
context’s importance in, 439 
linear equation systems in, 6 
polynomials and, 2-6 
representation theory in, 221 
for roots of polynomials, 15-16 
Algebraic number(s), 431 
cosine as, 431-432 
real numbers as, 431 
sine as, 431-432 
trigonometric functions as, 433 
Angles. See also Impossibility of 
trisecting angles 
bisecting, 639-640 
ruler and compass, trisecting, 
20-22 
60°, 585 
trisecting acute, 21-22 
Arrows. See Equivalence classes of 
arrows 
Associative functions, coset 
multiplication, 306 
Associative law of addition, 140, 232 
complex numbers and, 143, 145 
for vector space over field, 534 
Associative law of multiplication, 
140, 232 
for vector space over field, 534 
Z, and, 255 
Automorphism(s). See also Galois 
group; Groups of automorphisms 
bijective functions and, 177-178 
of commutative rings, 160-163, 
177-182, 374 
complex conjugation and, 288 
conditions for checking, 167 
exercises for, 184—187 




Index 


of fields, 167 

roots of polynomials and, 163-169 

in subgroups of Galois groups, 
613-614 


B 


Basis, 551 
linearly independent set for vector, 
554 
spanning set for vector, 554 
of vectors, 551-553 
Bijective functions, 52-54 
automorphisms and, 177-178 
binary operations and, 54-56 
composition of, 52-53 
groups of, 178, 267-271 
inverse of, 55-56 
subgroups of groups of, 277-279 
Bijective homomorphisms, 602 
radical extensions and, 653 
Binary functions, 52 
Binary operations 
bijective functions and, 54-56 
functions and, 46-56 
Bisecting angles, 639-640 


C 


Calculus 
convergence in, 126 
formal derivatives in, 477 
functions and real numbers in, 106 
functions in abstract algebra 
compared to, 46-47 
Fundamental Theorem of Algebra 
and, 218 
multiple roots of polynomials 
detected in, 12-13, 477 
multivariable, 142 
partial fraction decomposition, 
10-12 
Taylor polynomials in, 517-518 
Candidates, 379 
of Galois groups, 598-599 




Cauchy sequence(s), 126 


convergence and, 126-127 
equivalence relations in, 127-128 


Cayley’s Theorem, 358-359 
Centralizer(s) 


definition of, 294 
group structure and, 295 
as subgroups, 294 


Chains of fields 


radical extensions and, 648-649 
for relative size comparisons of 
fields, 527-528 


Chessboard, moving knights around 


altered, 22—23 


Circle(s) 


constructible real numbers’ case 2, 
one line and one, 637-638 

constructible real numbers’ case 3, 
two, 638-639 

squaring, 643 


Circuits 


parallel, 151 
series, 152 


Class Equation 


applications of, 328-330 

equivalence classes/equivalence 
relations and, 330-331 

groups with isomorphism and, 
328-329 

mathematical proof of, 330-332 


Coefficient(s). See also Complex 


coefficients; Leading coefficient; 
Rational coefficients; Real 
coefficients; Trailing coefficient 

division of polynomials by, 
411-412 

with field, and no division 
algorithm, 449 

polynomial multiplication in 
integral domain of, 367-369 

of polynomials belonging to fields, 
445 

polynomials of degree 1 with, in a
field, 383

polynomials of degree 2 with, in a 
field, 383-384 

polynomials of degree 3 with, in a 
field, 385-386 

polynomials of degree 4 with, in a 
field, 386-387 

positive integers compared to 
polynomials with, 512-515 

quadratic formula for roots of 
polynomials with Z,,, 263 

Zn, polynomials with, 256-259 


Coloring map problem, 32-35 
Common divisors 
polynomials of degree 3 and, 460 
of unique monic polynomials, 461 
Commutative law of addition, 140, 
162-163 
complex numbers and, 143 
Commutative law of multiplication, 
140, 162 
complex numbers and, 145-146 
degrees of field extensions and, 583 
Z, and, 255 
Commutative ring(s). See also Galois 
group; Noncommutative rings; 
Zn 
additive identity and, 161-163 
automorphisms of, 160-163, 
177-182, 374 
complex numbers and, 142-147 
fields and, 140-142, 240 
groups, multiplicative inverses in, 
267 
groups of, 178 
groups with addition of, 266 
groups with multiplication of 
invertible elements in, 266—267 
mathematical proof short cut for, 
163 
multiplication and, 233 
multiplicative identity and, 145, 
161-163 
rings compared to, 233, 370 
splitting fields and, 611-612 
vector space over field and, 537 
with zero divisors, 257-258 
without zero divisors, 366 
Commutators, 344-345 
Compass. See also Ruler and compass 
construction 
defining, 623 
dropping perpendicular line from 
point off line with, 627-629 
perpendicular line constructed from 
point on line with, 625-627 
radius preservation with, 625 
rigid, 625 
sequence of allowable moves on 
constructible points, preserving 
radius with, 625-632 
xy-plane, sequence of allowable 
moves for, 623-624 
Completely factored polynomials, 440 
as product of irreducible 
polynomials, 442 
“Completing the square,” 413 




Complex coefficients 


of polynomials and Fundamental 
Theorem of Algebra, 216-217 

polynomials of degree 1 and, 384 

polynomials with, 215-216, 
384-385 


Complex conjugation 


automorphisms and, 288 
conditions satisfied by, 155-156 
definition of, 154-155 
Mathematical Induction and, 156 
roots of polynomials with real 
coefficients and, 158-160 


Complex number(s), 5, 97. See also 


Roots of complex numbers 

addition in, 138 

additive identity and, 144 

associative law of addition and, 
143, 145 

challenges facing acceptance for, 
139-140 

commutative law of addition and, 
143 

commutative law of multiplication 
and, 145-146 

commutative rings and, 142-147 

DeMoivre’s Theorem, 
multiplication in polar form of, 
209-210 

distributive laws and, 146-147 

electrical circuits and 
multiplication of, 151-154 

as equivalence classes of arrows, 
192-195 

equivalence classes of arrows and 
exercises with, 203-206 

equivalence classes of arrows and 
matrix multiplication for, 
221-222 

equivalence classes of arrows 
associated with, 194 

exercises for, 148-154 

fields and, 147 

geometric representation of, 
189-199 

imaginary numbers terminology 
for, 139 

introducing, 137-138 

length of, 195-196 

multiple equivalence classes and 
multiple, 194-196 

multiplication and, 138, 151 

multiplication and geometrically, 
199-202 


multiplicative identity and, 143 

polar form and, 199-202 

questions regarding, 139 

real numbers as basis for 
understanding, 113, 138, 
145-147 

rectangular form and, 199-202 

roots of polynomials and, 138 


sum of squares of two integers and, 


198-199 
triangle inequality and, 196-197 
Complex variables, 215 
Composite, 71 
Computer science, positive integers 
in, 517 
Congruence classes, 228 
Congruent modulo n, 228 
Conjugates of g, 342 
Conjugation. See Complex 
conjugation 
Constructible points 
constructible real numbers, 
623-639 
radius from distance between two 
constructible points as third, 
630-632 
sequence of allowable moves by 
compass preserving radius on, 
625-632 
Constructible real numbers 
basic constructions for, 632-634 
case 1, two lines intersecting and, 
636-637 
case 2, one line and one circle, 
637-638 
case 3, two circles, 638-639 
constructible points and, 623-639 
as field, 634-635 
field extensions and, 635-636 
Continuity, 75-76 
definition of, 108 
Continuous functions, 218 
Contradiction. See Proof by 
contradiction 
Convergence, 100, 105 
in calculus, 126 
Cauchy sequence and, 126-127 
Convergent infinite series, 100 
Corollary(ies), 37 
of division algorithm, 450-452 
Corresponding homogeneous system 
of linear equations, 566-567 
Coset multiplication, 326-327 
associative functions and, 306 
defining, 303-304 


Kronecker’s Theorem and, 
663-664, 666 
normal subgroups and, 304-306 
Cosets 
addition of, 303 
groups of, 322 
right/left, 300-303 
Cosine 
as algebraic number, 431-432 
Rational Root Test and, 425-426 
rational values of, 423, 428-429 
roots of polynomials and, 432 
Creativity, 19-20. See also Intuition 
for solutions, 23 
Cubes, doubling of, 643 
Cubic polynomials, 416, 487 
6 variable found with, 418 
table of values produced by, 
496-497 
Cycles, transpositions and, 354. See 
also Disjoint cycle(s); Disjoint 
p-cycles; p-cycles; 3-cycle 
Cyclic groups, 274-276. See also 
Finite cyclic groups 
abelian groups and, 277 
Cyclic subgroups, 274-276, 356 
examples of, 275 


positive integers and Well Ordering
Principle for, 276
D 


Data, polynomial recognition using, 
8-10 
Data points 
lines and, 497 
polynomials and n, 497-498 
Decimal expansion, 98-99 
addition/multiplication for, 125 
mathematical proof for real 
numbers as quotient of integers 
with, 101-105 
understanding, 99-100 
Decimals, Latin meaning of, 114 
Degrees, as dimensions, 573 
Degrees of field extensions, 573 
commutative law of multiplication 
and, 583 
computing, 577 
division algorithm for, 577-578 
examples of, 586-589 
finding, 577-580 
finite sets and, 594-595 
Galois groups and, 597-599 
irreducible polynomials and, 
577-578 




linearly independent sets and, 
584-585 
minimum polynomials and, 580, 
595 
nonzero elements and, 583-584 
smallest, 581 
spanning sets and, 582-583 
splitting fields and, 601 
DeMoivre’s Theorem, 431, 577, 642 
exercises using, 433-436 
formulas derived from, 225 
multiplication of complex numbers 
in polar form and, 209-210 
radical extensions and, 648 
roots of complex numbers and, 
208-214 
Derivatives, 491. See also Formal 
derivatives 
Difference functions, 488. See also 
First difference function; nth 
difference function 
derivatives and, 491 
examples of, 488-489 
linear equations solved with, 
500-503 
Mathematical Induction formulas 
and, 501-503 
Dihedral groups, 283-286, 290 
as solvable groups, 327-328 
Dimension(s), 528. See also Finite 
dimensional vector spaces 
degree as, 573 
infinite, 554-555 
linear algebra, concept of, 528 
multiplication in weaker form and, 
531-532 
relative size and, 528 
subtleties of, 531 
of vector space over field, 551 
vector space over field, relative size 
reflected by, 585 
of vector space over field compared 
to subspace, 561 
of vectors, 551-553 
Direct product, 316 
Disjoint cycle(s) 
commutative nature of, 349-350 
division algorithm and, 350 
S_n and, 348-351
Disjoint p-cycles, 355-356 
Distributive law(s), 140 
complex numbers and, 146-147 
formal derivatives and, 478 


for polynomial multiplication, 
365-366 
of rings, 664 
in Z_n, 232
Divisibility 
integers and, 61-62 
of polynomials, 437-438 
tests, 244-247
Division, 67. See also Greatest 
common divisors 
of negative integers, 68 
of polynomial by its coefficient, 
411-412 
polynomials and long, 449-450 
of positive integers, 68-71 
Division algorithm, 64, 91, 101-103 
applications of, 449-450 
coefficients in field and no, 449 
corollary of, 450-452 
for degrees of field extensions, 
577-578 
disjoint cycles and, 350 
Euclidean Algorithm, applying,
82-83, 463 
finite remainders in, 98 
intuition for, 68-69 
mathematical proof of, 69-71 
for partial fraction decomposition, 
510 
partial fraction decomposition and 
polynomial properties with, 
512-515 
for polynomials with nonzero 
elements, 445-449 
for polynomials with p(x) base, 
518-520 
theorem of, 68 
uniqueness of, 449 
Well Ordering Principle and, 69 
Domino analogy (Mathematical 
Induction), 31-32 
Doubling of the cube, 643 


E 


Eisenstein’s Criterion, 458, 658-659 
examples of, 395 
irreducible polynomials and, 589,
605, 608

mathematical proof for, 396-397 
polynomials and, 395-396, 401 
prime numbers and, 394-395 
roots of polynomials and, 420 
splitting fields and, 605 

Electrical circuits, 151-154 


Element(s). See also Invertible 
elements; Nonzero elements 
additive inverse of, 55 
of Galois groups, 598-599 
identity, 54-55
linearly independent sets, 
replacing, 550 
multiplicative identity of field and, 
403 
spanning sets, replacing, 550 
of symmetric groups, 348-352 
E-primes, 76
Equilateral triangles, 281-283 
Equivalence classes, 123 
Class Equation and, 330-331 
of equivalence relations, 123-124 
infinite names in, 667 
Lagrange’s Theorem and, 300 
multiple complex numbers and 
multiple, 194-196 
Sylow’s Theorem and, 314-315 
of Z_n, 228-229
Z_n and changing names of,
230-231 
Equivalence classes of arrows 
addition of, 193 
complex numbers and exercises for, 
203-206 
complex numbers and matrix 
multiplication for, 221-222 
complex numbers as, 192-195 
complex numbers associated with, 
194 
real numbers as, 190-191 
Equivalence relations 
in Cauchy sequences, 127-128 
Class Equation and, 330-331 
equivalence classes of, 123-124 
Lagrange’s Theorem and, 300 
ordered pairs and, 122 
rational numbers, 118-128 
real numbers and, 118-128, 190 
reflexive/symmetric/transitive 
property examples of, 122-123 
Sylow’s Theorem and, 314-315 
Euclidean Algorithm, 64, 79, 91, 437, 
512 
division algorithm applied to, 
82-83, 463 
easy application/programming of, 
83-84 
examples of, 79-82 
formal argument for, 82-83 
in F[x], 462-466 




greatest common divisors found 
with, 79-91, 460-470 
irreducible polynomials and, 672 
for rational numbers in partial 
fraction decomposition, 514-515 
Euler φ function, 681
formula for, 249 
prime factorization and, 254 
prime numbers and, 248-253, 251t 
Existence of a root, 4-5 
Existence of prime factorization, 66, 
91 
Extreme Value Theorem, 219-220 
Fundamental Theorem of Algebra 
and, 218 


F 


Factor groups, 322 

as abelian groups, 325, 341 

difficulty of understanding, 
322-323 

examples of, 323-324 

isomorphism and, 324 

Field(s). See also Chains of fields;
Splitting field(s); Vector space
over field; Z_n

automorphisms of, 167-168

chain of fields for comparing 
relative size of, 527-528 

closed under multiplication, 
574-575 

coefficients and no division 
algorithm with, 449 

coefficients of polynomials 
belonging to, 445 

commutative rings and, 140-142, 
240 

complex numbers and, 147 

constructible real numbers as, 
634-635 

constructing smallest, 583 

definition of, 141 

examples of, 574-577 

finite extensions and, 671 

greatest common divisors of two 
polynomials in larger, 476-477 

groups of, 179 

groups with addition of, 266 

importance of understanding, 182 

multiple roots in, 479-481 

multiplicative identity for element 
of, 403 

multiplicative inverses and, 574 

polynomials of degree 1 with
coefficients in a, 383 


polynomials of degree 2 with 
coefficients in a, 383-384 

polynomials of degree 3 with 
coefficients in a, 385-386 

polynomials of degree 4 with 
coefficients in a, 386-387 

positive integers compared to 
polynomials with coefficients in, 
512-515 

as radical extensions, 647 

rings and, 671 

roots of polynomials in, 257 

set axioms for, 574 

spanning sets and, 575-576 

Z_n as, 241


Field extension(s), 14. See also
Degrees of field extensions

abstract algebra and complexity of, 
580 

constructible real numbers and, 
635-636 

finite, 596-597 

Galois groups of, 573 

radical, 645-657 

simple, 594-598 


Fifth-degree polynomials, 265 
Finite cyclic groups 


isomorphism and, 316-317 
Lagrange’s Theorem and, 317 


Finite dimensional vector spaces, 584 
Finite extensions, 596-597 


fields and, 671 


Finite group theory 


mathematical proofs in, 339-341 
Sylow’s Theorem and, 312 
symmetric groups in, 279, 347, 357 


Finite groups 


isomorphism and, 288 

Lagrange’s Theorem and, 316-318, 
324 

as solvable groups, 336-337 

structure of, 265, 288 

subgroups of, 294, 296-299, 311 

Sylow’s Theorem and, 316-318 


Finite sets 


degrees of field extensions and, 
594-595 
rings as, 236-237 


Finite subgroups, 337-338 
First difference function, 9, 488 


definition of, 489-490 
of polynomials, 491 


Formal derivatives 


in calculus, 477 
definition of, 477-478 


distributive law and, 478 
multiple roots and, 474-483
product rule and, 478-479, 481 


Formula(s). See also Quadratic
formula

from DeMoivre’s Theorem, 225 

difference functions for 
Mathematical Induction, 
501-503 

for Euler φ function, 249

Mathematical Induction finding, 
42-45, 499-503 

Mathematical Induction verifying, 
37-42, 487-488, 499 


Fractions. See Partial fraction(s) 
Functions. See also Bijective
functions; Difference functions;
Injective functions; Rational
functions; Surjective functions

in abstract algebra, 46 

associative, 306 

associative composition of, 
50-51, 55 

bijective, 52-54 

binary, 52 

binary operations and, 46-56 

calculus, real numbers and, 106 

in calculus compared to abstract 
algebra, 46-47 

continuous, 218 

of Galois groups, 604 

injective, 48 

properties of, 46-50 

of real variables, 215 

repeat values and, 47-48 

as rule, 46 

surjective, 48-49 


Fundamental Theorem of Algebra, 5,
451

calculus and, 218 

Extreme Value Theorem and, 218 

insolvability of the quintic and, 663 

Intermediate Value Theorem 
compared to, 220-221 

intuition for, 215, 218-219 

mathematical proofs of, 113, 
219-220, 573 

in partial fraction decomposition, 
512 

polynomials with complex 
coefficients and, 216-217 

roots of complex numbers and, 220 

variations of, 476 




G 


Galois, 3-4, 14. See also Insolvability 
of the quintic 
consequences of work by, 418-419 
insolvability of polynomials by 
radicals by, 655-660 
polynomials of degree 5 and work 
of, 420 
relationships between, 610-611 
solvable groups and work on 
insolvability of quintic of, 325 
special case in work of, 419-420 
Galois group(s) 
abelian, 606-607 
automorphisms in subgroups of, 
613-614 
candidates of, 598-599 
degrees of field extensions and, 
597-599 
elements of, 598-599 
examples of, 181 
exercises for, 182-184 
of field extensions, 573 
functions of, 604 
mathematical proof of, 179-181 
properties of, 179 
of radical extensions, 645-657 
of radical extensions and 
Insolvability of the quintic, 
651-654 
as solvable group, 325 
of splitting field as solvable group, 
608-610 
of splitting fields, 599-615 
of splitting fields and abelian 
groups, 607-608 
structure of, 603-604 
Galois Theorem, 4, 14, 573 
splitting fields and, 614 
Gauss’ Lemma, 469, 682 
examples using, 393-394 
intuition for, 390-391 
mathematical proof for, 391-393 
reducibility tests combined with, 
394 
Geometric series, 100, 114 
Geometry 
addition of real numbers in, 
190-191 
complex numbers and 
multiplication viewed in, 
199-202 
complex numbers/real numbers 
represented in, 189-199 


nonabelian subgroups of S_n viewed
in, 279-285 
ruler/compass construction and, 
6-7 
Goldbach’s conjecture (prime 
numbers), 62 
Greatest common divisors, 76 
computing, 78-79 
Euclidean Algorithm finding, 
79-91, 460-470 
in F[x], 462-466 
multiple roots of polynomials 
algorithm with, 477-479 
of nonzero polynomials, 460 
of polynomials moving to larger 
fields, 476-477 
prime factorization and, 78 


questions regarding, 77
unique factorization of polynomials
for, 470
uniqueness of prime factorization
and, 76-79
Green, B., 242-243
Group(s), 56. See also Abelian
groups; Cyclic groups; Factor
groups; Finite groups; Solvable
group(s); Subgroup(s);
Symmetric group(s)
of bijective functions, 178,
267-271
centralizers showing structure of,
295
Class Equation and isomorphism
in, 328-329
of commutative rings, 178
commutative rings under addition
as, 266
commutative rings with
multiplicative inverses as, 267
of cosets, 322
definition of, 265-266
dihedral, 283-286, 290
direct product and, 316
of fields, 179
fields under addition as, 266
with finite subgroups, 337-338
homomorphism of, 610-611
homomorphisms of, with injective
functions, 309-311
identity element of, 271
importance of understanding, 182
in insolvability of the quintic, 56
invertible elements in commutative
rings under multiplication as,
266-267
Invertible elements of rings, under
multiplication, as, 238-239
isomorphism of, 287-288, 674
Isomorphism Theorem for, 656,
675
mathematical proofs and, 297-299
non commutative, 181-182
nonabelian, 270-271
nonempty subsets and, 272
subgroups of bijective function,
277-279
subsets of, 271-274
two “same,” 286-287
Group theory, 4, 14
Sylow’s Theorem and, 312
Groups of automorphisms, of
commutative rings, 177-182


H


Homogeneous system of linear
equations, 564
corresponding, 566-567
unknowns and, 564-565
Homomorphism(s)
bijective, 602, 653
of groups, 610-611
groups with injective functions and,
309-311
injective, 601, 611
isomorphisms compared to, 307
normal subgroups and, 307-309
of rings, 370-374


I


Ideal(s), 370, 378
Kronecker’s Theorem and,
664-665
Identity element, 54-55
of groups, 271
Identity map, 53-54
Imaginary numbers, complex
numbers as, 139
Imagination. See Creativity
Impedance, 151
of parallel circuit, 152-154
of series circuit, 152
Impossibility of trisecting angles, 639
approach for, 623
Rational Root Test and, 642
with ruler and compass
construction, 640-643
Including multiplicities, 475
Indirect proof, 27
Infinite dimensional, 554-555
Infinite series. See also Convergent
infinite series
prime numbers and, 63
real numbers as, 100
Injective functions, 48
composition of, 52-53
homomorphisms and groups with,
309-311
mathematical proof challenges for,
52
Injective homomorphisms, 601, 611
Insolvability of polynomials by
radicals, 655-660
Insolvability of the quintic, 4, 14-15,
182, 411, 453, 610, 645-683
Fundamental Theorem of Algebra
and, 663
Galois groups of radical extensions
and, 651-654
groups in, 56
mathematical proof of, 657-660
radical extensions and, 646
solvable groups and Galois’ work
on, 325
Integer solutions, 397
Integers. See also Negative integers;
Positive integer(s); Sum of
squares of two integers
divisibility and, 61-62
importance of, 61
polynomials, similarities to, 29, 61,
437-444
prime numbers and, 61-64
real numbers as quotients of,
98-100
table of values and subsets of, 491
uniqueness of prime factorization
and, 64-67
Integers modulo n. See Z_n
Integral combinations, 79-84
Integral domain, polynomial
multiplication of coefficients in,
367-369
Intermediate Value Theorem, 423
Fundamental Theorem of Algebra
compared to, 220-221
Least Upper Bound Property
proving, 107-108
mathematical proof of, 108-110
Rational Root Test and, 380
real numbers compared to rational
numbers using, 105-106
roots of complex numbers and, 211
roots of polynomials found with,
110-113, 420
Intuition, 25 
for division algorithm, 68-69 
for Fundamental Theorem of 
Algebra, 215, 218-219 
for Gauss’ Lemma, 390-391 
Mathematical Induction and, 35 
mathematical proofs and, 19-20 
rational numbers and, 125-126 
triangle inequality and, 197 
Invertible elements 
groups with commutative rings 
under multiplication and, 
266-267 
of rings as group under 
multiplication, 238-239 
rings with zero divisors or, 
234-237 
in Z_n, 234, 237, 239-240, 248
Irrational numbers, 29 
uniqueness of prime factorization, 
positive integers and, 90-91 
Irreducible polynomials, 439 
completely factored polynomials as 
product of, 442 
degrees of field extensions and, 
577-578 
Eisenstein’s Criterion and, 589, 
605, 608 
Euclidean Algorithm and, 672 
in F[x], 466-467 
infinite, 443-444 
minimum polynomials and, 
457-460 
partial fraction denominator into, 
510 
with rational coefficients and no 
multiple roots, 482-483 
relatively prime, 466 
unique factorization theorem and, 
439-440, 467-468 
Irreducible quadratic polynomials, 
452-453 
linear functions and, 579 
splitting fields and, 600 
unique factorization theorem and, 
579 
Isomorphism(s) 
abelian groups and, 288 
Class Equation and groups with, 
328-329 
factor groups and, 324 
finite cyclic groups and, 316-317 
finite groups and, 288 


of groups, 287-288, 674 
homomorphisms compared to, 307 
Isomorphism Theorem 
for groups, 656, 675 
mathematical proof of, 332-334 
for rings, 674-677 
Isosceles triangles, nonabelian 
subgroups of S_n and, 279-281


K 


Kronecker’s Theorem, 370 
coset multiplication and, 663-664, 
666 
definition of, 664 
examples of, 664-665, 668-671 
ideals in, 664-665 
mathematical proof of, 673-674 


L 


Lagrange’s Theorem, 329-330, 
336-337, 339, 341, 364 
converse of, 359 
equivalence relations/equivalence 
classes and, 300 
finite cyclic groups and, 317 
finite groups and, 316-318, 324 
mathematical proof of, 300 
right/left cosets and, 300-303 
subgroups and, 300 
Leading coefficient, 366-367, 369 
Rational Root Test and, 382-383 
Least common multiple, 94 
Least Upper Bound Property 
exercises for finding, 115-117 
Intermediate Value Theorem 
proven with, 107-108 
observations on, 107 
real numbers compared to rational 
numbers and, 107 
Left cosets, 300-303. See also Cosets 
Lemma(s), 37. See also Gauss’ 
Lemma 
for proving Sylow’s Theorem, 
312-315 
Line(s). See also Parallel line; 
Perpendicular line 
constructible real numbers’ case 1, 
intersection of two, 636-637 
constructible real numbers’ case 2, 
one circle and one, 637-638 
data points and, 497 
Linear algebra 
dimension in, 528 
polynomial examples in, 528-533 




Linear equations. See also 
Homogeneous system of linear 
equations 

algebra, systems of, 6 

difference functions for solving, 
500-503 

partial fraction decomposition, 
solution for, 511 

solving, 5-6, 498-499 

subspaces and, 563 

system of, 564 

vectors and, 540-541 

Linear functions 

irreducible quadratic polynomials 
and, 579 

in precalculus, 487 

table of values and, 496 

Linear polynomials, 452-453 

Linear transformation, 569 

Linearly dependent sets 

definition of, 545 

vectors and, 546-547 

Linearly independent sets 

as basis of vectors, 554 

definition of, 545 

degrees of field extensions and, 
584-585 

elements replaced in, 550 

as spanning sets, 553-554 

spanning sets and, 546-548 

spanning sets compared to, 553 

vector subsets, relative size of 
spanning sets and, 549-550 

vectors and, 546-547 

Long division, of polynomials, 
449-450 


M 


Machinery, 12-13 
Map, coloring problem, 32-35 
Mathematical Induction, 10, 254, 467 
complex conjugation and, 156 
difference functions for formulas 
of, 501-503 
domino example/analogy for, 
31-32 
exercises for, 41-45 
first version, 30-32 
first version, revisited, 32-36 
formulas, finding with, 42-45, 
499-503 
formulas, verifying with, 37-42, 
487-488, 499 
intuition and, 35 




Partial fraction decomposition, 
denominator factors applying, 
514-515 

polynomials and, 499-503 

for polynomials with nonzero 
elements, 447 

Rational Root Test using, 427-428 

satisfying both parts of, 43-45 

second version, 37, 447-448, 519 

Two-color problem for planes 
divided up by lines using, 32-35 

uniqueness of prime factorization 
and, 87 

from Well Ordering Principle, 
29-30 

Well Ordering Principle compared 
to, 30-31 


Mathematical proof(s). See also
Mathematical Induction

Class Equation, 330-332 

commutative rings, short cut for, 
163 

of division algorithm, 69-71 

for Eisenstein’s Criterion, 396-397 

exercises for, 37-41 

in finite group theory, 339-341 

of Fundamental Theorem of 
Algebra, 113, 219-220, 573 

of Galois group, 179-181 

for Gauss’ Lemma, 391-393 

groups and, 297-299 

for impossibilities, 645 

indirect, 27 

injective/surjective functions 
challenges with, 52 

of insolvability of the quintic, 
657-660 

of Intermediate Value Theorem, 
108-110 

intuition and, 19-20 

of Isomorphism Theorem, 332-334 

of Kronecker’s Theorem, 673-674 

of Lagrange’s Theorem, 300 

for partial fraction decomposition, 
520-521 

purpose of, 19, 27 

of Rational Root Test, 381-382 

for real numbers with decimal 
expansion as quotient of 
integers, 101-105 

of Sylow’s Theorem, 315-316 

for unique factorization theorem, 
442-444 


of uniqueness of prime 
factorization, 84-85, 87-88 
Well Ordering Principle and, 20-26 
Mathematical statements. See also 
Well Ordering Principle 
proof by contradiction and, 26-27 
true/false, 24 
Mathematics 
creativity and, 19-20 
understanding new concepts of, 
75-76
Matrix multiplication, 146-147 
complex numbers as equivalence 
classes of arrows and, 221-222 
Mean Value Theorem, 10 
Minimum polynomials, 457 
degrees of field extensions and, 
580, 595 
irreducible polynomials and, 
457-460 
for a over F, 457, 460 
unique factorization of polynomials 
finding, 468-470 
Modulo n. See Z_n
Monic divisors, 460 
Monic polynomials, 169-170, 262, 
460. See also Unique monic 
polynomials 
in F[x], 466-467 
infinite, 442-444 
irreducible, 409 
Rational Root Test and, 426 
relatively prime, 466 
unique factorization theorem and, 
440 
Motions. See Rigid motions 
Moving knights around altered 
chessboard, 22-23
Multiple roots 
counting, 475 
in field, 479-481 
formal derivatives and, 474-483 
irreducible polynomials with 
coefficients and no, 482-483 
Multiple roots of polynomials 
calculus, detecting, 12-13, 477 
greatest common divisors 
algorithm for, 477-479 
Multiplication. See also Associative 
law of multiplication; 
Commutative law of 
multiplication; Coset 
multiplication; Matrix 
multiplication; Polynomial 




multiplication; Scalar 
multiplication 
associative law of, 140 
in basic arithmetic, 125 
commutative law of, 140 
commutative rings and, 233 
complex numbers and, 138, 151 
complex numbers geometrically 
viewing, 199-202 
for decimal expansion, 125 
DeMoivre’s Theorem, complex 
numbers in polar form and, 
209-210 
dimension and weaker form of, 
531-532 
electrical circuits, complex 
numbers and, 151-154 
fields closed under, 574-575 
groups with invertible elements in 
commutative rings under, 
266-267 
Invertible elements of rings as 
group under, 238-239 
of ordered pairs, 531-532 
of rational numbers, 118-122 
of rational numbers as well defined, 
120-121 
in Z_n, 229-232, 235
Multiplicative identity, 140 
commutative rings and, 145, 
161-163 
complex numbers and, 143 
for element of field, 403-404 
sum of copies of, 478 
Multiplicative inverses, 140 
fields and, 574 
groups, commutative rings with, 
267 
of nonzero elements, 439 
polynomials and, 142 
Multivariable calculus, 142 


N 


Natural numbers, 24 

prime numbers as building blocks 
of, 348-349, 353 

Negative integers, division of, 68 

n-gons, rigid motions of, 283 

Nonabelian groups, 270-271 

Nonabelian subgroups of S_n
equilateral triangles and, 281-283
geometry approach to, 279-285 
isosceles triangles and, 279-281 

Noncommutative rings, 146-147 

Nonempty subsets, 272 


Nonzero constant 
nth difference function, with 
polynomial degree n, as, 
493-495 


nth difference function as, 490-491 


Nonzero elements 
degrees of field extensions and, 
583-584 
division algorithm for polynomials 
with, 445-449 
Mathematical Induction for 
polynomials with, 447 
multiplicative inverses, 439 
Nonzero polynomials 
greatest common divisors of, 460 
primitive, 409 
Normal subgroups, 303-304 
coset multiplication and, 304-306 
homomorphisms and, 307-309 
solvable groups and, 334-336, 
338-339 
Normalizers, 345 
of subgroups, 364 
nth cyclotomic polynomials, 681 
nth difference function, 488 
as nonzero constant, 490-491 
as nonzero constant with 
polynomial degree n, 493-495 
Number system(s), 140 


O 


Ohms, 151 

Ordered pairs 
addition of, 529-530 
equivalence relations and, 122 
multiplication of, 531-532 
rational numbers as, 118-122
relative size of, 531 


P 


π, as transcendental number, 643
Parallel circuits, 151 
examples of, 152 
impedance of, 152-154 
Parallel line, 629-630 
Partial fraction(s), 11 
calculus and decomposition of, 
10-12 
division algorithm for 
decomposition of, 510 
Fundamental Theorem of Algebra 
in decomposition of, 512 
irreducible polynomials from 
denominator in, 510 


linear equation solution for 
decomposition of, 511 
Mathematical Induction on 
denominator factors for 
decomposition of, 514-515 
mathematical proof for 
decomposition of, 520-521 
polynomial properties from 
division algorithm in 
decomposition of, 512-515 
prime factorization in 
decomposition of, 513-514 
rational function, decomposition 
of, 510-523 
rational numbers using Euclidean 
Algorithm for decomposition of, 
514-515 
types of, 11, 512 
p-cycles, 355-356 
Perpendicular line 
compass constructing, from point 
on line, 625-627 
compass from point off line 
dropping, 627-629 
Polar form, 200 
complex numbers and, 199-202 
DeMoivre’s Theorem, 
multiplication of complex 
numbers in, 209-210 
inverses in, 209 
rectangular form, converting 
between, 201-202 
Polynomial(s), 262. See also
Completely factored
polynomials; Cubic polynomials;
Fifth-degree polynomials;
Irreducible polynomials;
Minimum polynomials; Monic
polynomials; Roots of
polynomials; Unique
factorization of polynomials

additive identity in, 530 

additive inverses with, 530 

algebra and, 2-6 

in base x, 517-518 

calculus, detecting multiple roots 
of, 12-13 

with coefficients in Z_n, 256-259

coefficients of, belonging to fields, 
445 


with complex coefficients, 215-216 


with complex coefficients and 
Fundamental Theorem of 
Algebra, 216-217 

divisibility of, 437-438 




division algorithm for nonzero 
elements in, 445-449 

division algorithm for p(x) base, 
518-520 

division by coefficient of, 
411-412 

Eisenstein’s Criterion and, 
395-396, 401 

existence of roots of, 4-5 

finding roots of, 2-4 

first difference function of, 491 

Galois’ work on insolvability of 
radicals and, 655-660 

greatest common divisors in larger 
fields of, 476-477 

important facts about, 437 

integers, similarities to, 29, 61, 
437-444 

integrating, 12 

linear, 452-453 

linear algebra examples of, 
528-533 

long division of, 449-450 

Mathematical Induction and, 
499-503 

monic divisors of, 460 

multiplicative inverses and, 142 

n data points and, 497-498 

with nonzero elements, using 
Mathematical Induction, 447 

nth cyclotomic, 681 

nth difference function as nonzero 
constant with degree n, 493-495 

partial fraction decomposition, 
division algorithm for properties 
of, 512-515 

positive integers compared to, with 
coefficients in field, 512-515 

precalculus, data used for 
recognizing, 8-10 

prime numbers of, 242-243 

primitive nonzero, 409 

quadratic, 261-262 

quartic, 419 

radical extensions and, 647-649 

Rational Root Test and, 401 

with real coefficients/complex 
coefficients, 384-385 

reducibility tests and, 398-400, 
408-409 

reducible/irreducible, 390 

rings, 595 

of smallest degree, 451-452 

table of values produced by, 487 

Taylor, 517-518 




Polynomial multiplication 
with coefficients in integral 
domain, 367-369 
distributive law for, 365-366 
Polynomials of degree 1 
with coefficients in field, 383 
complex coefficients and, 384 
as irreducible, 383 
roots of, 412 
Polynomials of degree 2 
with coefficients in field, 383-384 
as irreducible, 383 
roots of, 412-414 
Polynomials of degree 3 
with coefficients in a field, 385-386 
common divisors and, 460 
quadratic formula for, 413-415 
Rational Root Test for, 386 
roots of, 414-415 
Polynomials of degree 4 
with coefficients in a field, 386-387 
quadratic formula for, 413, 
416-417 
as reducible, 394 
roots of, 415-418 
Polynomials of degree 5 
Galois’ work on, 420 
insolvability by radical extensions 
of, 655-660 
roots of, 418-419 
Polynomials of degree 7, 660 
Positive integer(s), 24 
as composite, 71 
computation in base 10 of, 516 
in computer science, 517 
cyclic subgroups and Well 
Ordering Principle for, 276 
division of, 68-71 
polynomials with coefficients in 
field compared to, 512-515 
as prime numbers, 91 
uniqueness of prime factorization, 
irrational numbers and, 90-91 
uniqueness of prime factorization, 
rational numbers and, 89-90 
Precalculus 
linear functions in, 487 
polynomial recognition using data 
in, 8-10 
Prime factorization, 63-64, 437. See 
also Uniqueness of prime 
factorization 
Euler φ function and, 254
exercises for, 72-73 
existence of, 66, 91 


greatest common divisors and, 78 
in partial fraction decomposition, 
513-514 
uniqueness of, 64-67
Well Ordering Principle and, 65-66 
Prime number(s), 62, 91. See also 
Twin primes 
Eisenstein’s Criterion and, 394-395 
Euler φ function and, 248-253, 251t
exploiting properties of, 248 
gaps between, 74 
Goldbach’s conjecture for, 62 
infinite number of, 66-67, 74-75 
infinite series and, 63 
integers and, 61-64 
natural numbers from, 348-349, 
353 
of polynomials, 242-243 
positive integers as, 91 
product of, 64 
questions regarding, 62-63 
relatively, 84 
twin primes conjecture for, 63 
Z_n and, 240
Primitive nonzero polynomials, 409 
Primitive nth root, 260, 681 
Product of primes, 64 
Product rule, formal derivatives and, 
478-479, 481 
Proof by contradiction, 26 
avoiding, 29-30 
dangers of using, 27 
mathematical statements and, 
26-27 
Well Ordering Principle used with, 
27-29 
Proofs. See Mathematical proof(s) 
Proposition, 37. See also 
Mathematical statements 
Pythagorean Theorem, 196, 423 
trigonometry and, 424 


Q 


6 variable, cubic polynomials finding, 
418 
Quadratic formula, 3, 600, 609 
for polynomials of degree 3, 
413-415 
for polynomials of degree 4, 413, 
416-417 
roots of polynomials found with, 
149 
for roots of polynomials with 
coefficients in Z_n, 263
splitting fields using, 600 




Quadratic polynomials, 261-262, 
393 
irreducible, 452-453 
product of two, 401-402 
table of values produced by, 
494-495 
Quartic polynomials, 419 
Quotient, 68 
of integers, mathematical proof for 
real numbers with decimal 
expansion as, 101-105 
of integers, real numbers as, 
98-100 
Quotient groups. See Factor groups 
Quotient rings, 370 


R 


Radian measure, 7 
trigonometric functions and, 424 
Radical extensions 
bijective homomorphisms and, 653 
chains of fields and, 648-649 
DeMoivre’s Theorem and, 648 
fields as, 647 
Galois groups of, 645-657 
Galois’ work on insolvability of 
polynomials by, 655-660 
insolvability of the quintic and, 646 
insolvability of the quintic and 
Galois groups of, 651-654 
polynomials and, 647-649 
polynomials of degree 5, 
insolvability by, 655-660 
splitting fields and, 653 
Radius 
compass and preserving, 625 
at constructible point from distance 
between two others, 630-632 
sequence of allowable moves, 
compass on constructible points, 
preserving, 625-632 
Rational coefficients 
irreducible polynomials, no 
multiple roots, with, 482-483 
rational roots of polynomials with, 
379 
Rational functions 
addition of, 470 
decomposing, 515-517 
integrating, 11 
partial fraction decomposition of, 
510-523 
Rational number(s). See also 
Irrational numbers 


addition/multiplication as well 
defined for, 120-121 
addition/multiplication of, 118-122 
approaches to, 97, 118, 125-126 
convergent infinite series of, 100 
equivalence relations and, 118-128 
Intermediate Value Theorem 
comparing real numbers to, 
105-106 
intuition and, 125-126 
Least Upper Bound Property, real 
numbers compared to, 107 
in lowest terms, 25 
motivations for studying, 97-98 
name complications for, 119 
as ordered pairs, 118-122 
partial fraction decomposition, 
Euclidean Algorithm for, 
514-515 
real numbers as, 98-101, 126-128 
uniqueness of prime factorization, 
positive integers as, 89-90 
Rational Root Test, 399, 458, 469 
candidates for, 379 
cosine and, 425-426 
examples of, 379-381, 393-394 
impossibility of trisecting angles 
and, 642 
Intermediate Value Theorem and, 
380 
leading coefficient/trailing 
coefficient and, 382-383 
Mathematical Induction used for, 
427-428 
mathematical proof of, 381-382 
monic polynomials and, 426 
polynomials and, 401 
for polynomials of degree 3, 386 
for rational values of trigonometric 
functions, 425-427 
real numbers and, 380 
reduction modulo p and, 400 
sine and, 424-425 
Rational roots of polynomials 
of degree 1, 401 
with rational coefficients, 379 
Rational values 
of cosine, 423, 428-429 
Rational Root Test for 
trigonometric functions with, 
425-427 
of sine, 423, 429-430, 434 
of tangent, 423, 430 
of trigonometric functions, 7, 
424-433 


Real coefficients 
complex conjugation for roots of 
polynomials with, 158-160 
polynomials with, 384-385 
roots of polynomials of odd degree 
with, 111-113 
Real number(s), 5. See also 
Constructible real numbers 
addition of, in geometry, 190-191 
algebraic numbers and, 431 
approaches to, 97, 118 
calculus, functions and, 106 
complex numbers’ basis in, 113, 
138, 145-147 
constructible, 6-7 
as equivalence classes of arrows, 
190-191 
equivalence relations and, 
118-128, 190 
geometric representation of, 
189-199 
as infinite series, 100 
Intermediate Value Theorem 
comparing rational numbers to, 
105-106 
Least Upper Bound Property, 
rational numbers compared to, 
107 
motivations for studying, 97-98 
as quotient of integers, 98-100 
as quotient of integers with decimal 
expansion, mathematical proof, 
101-105 
as rational numbers, 98-101, 
126-128 
Rational Root Test and, 380 
roots of polynomials and, 137 
transcendental numbers and, 431 
understanding, 99-100 
xy-plane, construction of, 624-625 
Real variables, 218 
functions of, 215 
Rectangular form, 199 
complex numbers and, 199-202 
polar form, converting between, 
201-202 
Reducibility tests 
Gauss’ Lemma combined with, 394 
polynomials and, 398-400, 
408-409 
Reduction modulo p, 398-407 
examples of, 400-401 
Rational Root Test and, 400 
warning regarding, 399 
Reflexive property, 122-123 




Relative size 
dimension and, 528 
field comparisons for comparing 
fields and, 527-528 
of ordered pairs, 531 
vector space over field, dimension 
reflecting, 585 
of vector space over field and, 
548-549 
of vector subsets, linearly 
independent sets/spanning sets, 
549-550 
Relatively prime, 84 
irreducible polynomials, 466 
monic polynomials, 466 
Remainder(s), 68 
division algorithm and finite, 98 
Repeat values, 47-48 
Representation theory, 221 
Right cosets, 300-303. See also 
Cosets 
Rigid compass, 625 
Rigid motions, of n-gons, 283-286 
Ring(s), 146. See also Commutative 
ring(s) 
commutative rings compared to, 
233, 370 
distributive laws of, 664 
fields and, 671 
as finite set, 236-237 
homomorphisms of, 370-374 
invertible elements as group under 
multiplication, 238-239 
with invertible elements or zero 
divisors, 234-237 
Isomorphism Theorem for, 
674-677 
noncommutative, 146-147 
polynomial, 595 
properties of, 377-378 
quotient, 370 
Rolle’s Theorem, 658-659 
roots of polynomials and, 115, 420 
Roots, multiple. See also Multiple 
roots of polynomials 
counting, 475 
in field, 479-481 
formal derivatives and, 474-483 
irreducible polynomials with 
coefficients and no, 482-483 
Roots of complex numbers 
DeMoivre’s Theorem and, 208-214 
Fundamental Theorem of Algebra 
and, 220 


Intermediate Value Theorem and, 
211 
Roots of polynomials. See also 
Multiple roots of polynomials; 
Rational roots of polynomials 
algebra for finding, 15-16 
automorphisms and, 163-169 
calculus, detecting multiple, 12-13 
with coefficients in Zn, using 
quadratic formula, 263 
complex conjugation for real 
coefficients and, 158-160 
complex numbers and, 138 
cosine and, 432 
of degree 1, 412 
of degree 2, 412-414 
of degree 3, 414-415 
of degree 4, 415-418 
of degree 5, 418-419 
Eisenstein’s Criterion and, 420 
existence of, 4-5 
factors of, 258-259 
in fields, 257 
finding, 2-4 
Intermediate Value Theorem 
finding, 110-113, 420 
of odd degree with real coefficients, 
111-113 
quadratic formula finding, 149 
real numbers and, 137 
Rolle’s Theorem for, 115, 420 
sine and, 432 
of small degree, 411-418 
Ruler(s) 
defining, 623 
ignoring markings on, 625 
xy-plane, sequence of allowable 
moves for, 623-624 
Ruler and compass construction, 6-7 
impossibility of trisecting angles 
with, 640-643 
terminology’s importance for, 623 
trisecting angles with, 20-22 


S 

S4 
as solvable group, 360 
subgroups of, 359-360 
3-cycles in, 360 

Scalar multiplication, 535 
subspaces and, 560 

Scalars, 535 

Second differences, 9 

Sequence of allowable moves 


for compass preserving radius on 
constructible points, 625-632 
for ruler/compass on xy-plane, 
623-624 
Series circuits, 151 
impedance of, 152 
Simple field extensions, 594-598 
Sine 
as algebraic number, 431-432 
Rational Root Test and, 424-425 
rational values of, 423, 429-430, 
434 
roots of polynomials and, 432 
60° angles, 585 
Sn. See also Nonabelian subgroups 
of Sn 
insolvability of, 357 
transpositions and, 352-353 
transpositions in subgroup of, 
354-355 
Solvable group(s), 322-325 
abelian groups and, 332 
defining, 325-327 
dihedral groups as, 327-328 
finite groups as, 336-337 
Galois groups as, 325 
Galois groups of splitting field as, 
608-610 
Galois’ work on insolvability of 
quintic and, 325 
normal subgroups and, 334-336, 
338-339 
S4 as, 360 
symmetric groups as, 328 
Spanning sets 
as basis of vectors, 554 
degrees of field extensions and, 
582-583 
elements replaced in, 550 
fields and, 575-576 
finite, 554-555 
as linearly independent sets, 
553-554 
linearly independent sets and, 
546-548 
linearly independent sets compared 
to, 553 
of vector space over field, 542-545 
vector subsets, relative size of 
linearly independent sets and, 
549-550 
Splitting field(s), 647 
bijective homomorphisms and, 602 
commutative rings and, 611-612 
degrees of field extensions and, 601 




Eisenstein’s Criterion and, 605 
Galois groups of, 599-615 
Galois groups of, and abelian 
groups, 607-608 
Galois groups of, as solvable 
group, 608-610 
Galois Theorem and, 614 
injective homomorphisms and, 601 
irreducible quadratic polynomials 
and, 600 
quadratic formula for, 600 
radical extensions and, 653 
Squaring of the circle, 643 
Statements. See Mathematical 
statements 
Straightedge, 22 
Subgroup(s). See also Finite 
subgroups; Normal subgroups 
automorphisms in Galois groups 
and, 613-614 
centralizers as, 294 
cyclic, 274-276 
of finite groups, 294, 296-299, 311 
of groups of bijective functions, 
277-279 
Lagrange’s Theorem and, 300 
normalizer of, 364 
of S4, 359-360 
of Sn, transpositions, 354-355 
Sylow’s Theorem and, 317 
Subsets 
of groups, 271-274 
groups and nonempty, 272 
of integers and table of values, 491 
of vectors, relative size of linearly 
independent spanning sets, 
549-550 
Subspace(s) 
dimension of vector space over 
field compared to, 561 
examples of, 562 
linear equations and, 563 
scalar multiplication and, 560 
of vector space over field, 560-561 
Sum of copies of multiplicative 
identity, 478 
Sum of squares of two integers, 
198-199 
Surjective functions, 48-49 
composition of, 52-53 
mathematical proof challenges for, 
52 
Sylow’s Theorem, 265, 311-312, 
336-337, 339-340, 364 


equivalence classes/equivalence 
relations and, 314-315 
finite group theory and, 312 
finite groups and, 316-318 
group theory and, 312 
lemmas proving, 312-315 
mathematical proof of, 315-316 
subgroups and, 317 
Symmetric group(s), 269-270 
Cayley’s Theorem and, 358-359 
elements of, 348-352 
examples of, 276 
in finite group theory, 279, 347, 357 
insolvability of, 347, 357 
as solvable groups, 328 
Symmetric property, 122-123 
Symmetries, 283 
System of linear equations, 564. See 
also Homogeneous system of 
linear equations 
solutions for, 566-567 
types of, 565-566 


T 


Table of values 
cubic polynomials producing, 
496-497 
linear functions and, 496 
polynomials producing, 487 
quadratic polynomials producing, 
494-495 
subset of integers and, 491 
Tangent, 423, 430 
Tao, T., 242-243 
Taylor polynomials, 517-518 
Teachers, abstract algebra and, 13 
Theorem, 37. See also specific 
theorems 
of division algorithm, 68 
3-cycle, 356-357 
3-cycles, in S4, 360 
Tools, 12-13 
Trailing coefficient, 366-367, 369 
Rational Root Test and, 382-383 
Transcendental numbers, 431 
π as, 643 
Transformation, linear, 569 
Transitive property, 122-123 
Transpositions 
cycles and, 354 
Sn and, 352-353 
in subgroup of Sn, 354-355 
Triangle inequality, 196 
complex numbers and, 196-197 
intuition and, 197 


Triangles 
equilateral, 281-283 
isosceles, 279-281 
Trigonometric functions 
as algebraic numbers, 433 
radian measure and, 424 
Rational Root Test for rational 
values of, 425-427 
rational values of, 7, 424-433 
Trigonometry, 7 
Pythagorean Theorem and, 424 
Trisection. See also Impossibility of 
trisecting angles 
of acute angles, 21-22 
ruler and compass construction for 
angles and, 20-22 
Twin primes 
conjecture, 63 
exercise for, 71 
Two-color problem for planes divided 
up by lines, 32-35 


U 


Unique factorization of polynomials, 
467-468 
for greatest common divisors, 470 
minimum polynomials found with, 
468-470 
Unique factorization theorem 
definition of, 439-440 
examples of, 440-442 
irreducible polynomials and, 
439-440, 467-468 
irreducible quadratic polynomials 
and, 579 
mathematical proofs for, 442-444 
monic polynomials and, 440 
Unique monic polynomials, 
457, 460 
common divisors of, 461 
of smallest degree, 461-462 
Uniqueness of prime factorization, 91 
-primes and, 76 
example of, 85-86 
fixed order for, 86 
greatest common divisors and, 
76-79 
integers and, 64-67 
Mathematical Induction and, 87 
mathematical proof of, 84-85, 
87-88 
notation for, 86 
positive integers/irrational numbers 
and, 90-91 




positive integers/rational numbers 
and, 89-90 
understanding, 76 
Unknowns, 564 
homogeneous system of linear 
equations and, 564-565 
Upper bound, 106. See also Least 
Upper Bound Property 


V 


Variables, 564 
complex, 215 
real, 215, 218 
Vector(s), 535 
additive identity of, 535 
basis of, 551-553 
dimensions of, 551-553 
finite dimensional, 584 
linear equations and, 540-541 
linearly dependent sets and, 
546-547 
linearly independent set as basis of, 
554 
linearly independent sets and, 
546-547 
relative size of linearly independent 
sets/spanning sets in subsets of, 
549-550 
spanning set as basis of, 554 
Vector space over field 
associative law of addition for, 534 
associative law of multiplication 
for, 534 
commutative rings and, 537 
dimension of, 551 
dimension of subspace compared 
to, 561 
dimension reflecting relative size 
of, 585 
examples fundamental for, 537 
exercises for, 538-540 
facts regarding, 535-536 
finite, 563-564, 581 
infinite dimensional, 554-555 
properties of, 533-534 
relative size of, 548-549 
sets that are not, 536-537 
spanning sets of, 542-545 
subspaces of, 560-561 


W 


Well Ordering Principle, 24, 336, 440, 
442, 461, 664 
approaches to, 24-25 



cyclic subgroups with positive 
integers and, 276 

division algorithm and, 69 

exercises using, 41-42 

Mathematical Induction compared 
to, 30-31 

Mathematical Induction from, 
29-30 

mathematical proofs and, 20-26 

prime factorization and, 65-66 

proof by contradiction used with, 
27-29 

uses of, 25-26 


X 
xy-plane 
constructible points on, 624 


real number construction on, 
624-625 

ruler/compass allowable moves in, 
623-624 


Z 


Zero, existence of, 139 
Zero divisor(s) 
commutative rings with, 257-258 
commutative rings without, 366 
rings with invertible elements or, 
234-237 
in Zn, 234, 237, 259 
Zn, 323 
addition defined in, 229-232 
associative law of multiplication 
and, 255 
commutative law of multiplication 
and, 255 




definition of, 227-228 

distributive laws in, 232 

equivalence classes changing 
names in, 230-231 

equivalence classes of, 228-229 

exercises for, 260-263 

as field, 241 

importance of, 227 

invertible elements in, 234, 237, 
239-240, 248 

multiplication in, 229-232, 235 

polynomials with coefficients in, 
256-259 

prime numbers and, 240 

properties of, 227-232 

quadratic formula for roots of 
polynomials with coefficients in, 
263 

zero divisors in, 234, 237, 259 


